BULLETIN OF THE 
SCHOOL OF EDUCATION 
INDIANA UNIVERSITY 


Vol. VII BLOOMINGTON, INDIANA No. 5 


May, 1931 


Entered as second-class mail matter, September 30, 1924, at the postoffice at 
Bloomington, Indiana, under the Act of August 24, 1912. Published six times a year, 
from the University Office, Bloomington, Indiana. 


Eighteenth Annual Conference 


on 


Educational Measurements 


Held at Indiana University 


April 24 and 25, 1931 


Bureau of Cooperative Research, Indiana University School of 
Education 


HENRY LESTER SMITH, Director 




Price, 50 cents. For sale by the University Bookstore, Bloom- 
ington, Indiana. 


A limited number of copies of this bulletin will be distributed free 
of charge. 


The contents of this issue are listed in the following: 
Loyola Digest 
Wilson Education Index 






Published by the 


BUREAU OF COOPERATIVE RESEARCH 


INDIANA UNIVERSITY 




Table of Contents 


A STUDY OF THE PREDICTIVE VALUE OF MUSIC TALENT 
TESTS FOR TEACHER TRAINING PURPOSES, Lowell M. Tilson, 
Head of Department of Music, Indiana State Teachers College, Terre 
Haute 


A STUDY IN METHODS OF COLLEGE TEACHING, A. R. Eikenberry, 
Professor of Education, Manchester College, North Manchester 


THE STATE HIGH SCHOOL TESTING PROGRAM—SOME FIRST 
RESULTS, H. H. Remmers, Associate Professor of Education and 
Psychology, Purdue University 


THE GROWING DEMAND FOR RESEARCH WORKERS IN 
BUREAUS OF EDUCATIONAL RESEARCH, Henry Lester Smith, 
Dean of the School of Education, Indiana University 


AN EXPERIMENT WITH THE LECTURE METHOD IN COLLEGE 
TEACHING, J. R. Shannon, Professor of Education, Indiana State 
Teachers College, Terre Haute 


THE PSYCHOLOGICAL EXAMINATIONS AT THE INDIANA 
STATE TEACHERS COLLEGE, J. W. Jones, Director, Division of 
Research, Indiana State Teachers College, Terre Haute 


PREDICTING SCHOLARSHIP IN THE JUNIOR HIGH SCHOOL, 
Fowler D. Brooks, Chairman of the Departments of Education and 
Psychology, DePauw University 


A Study of the Predictive Value of Music 
Talent Tests for Teacher Training 
Purposes 


LOWELL M. TILSON, Head of Department of Music, Indiana State 
Teachers College, Terre Haute 


At its 1930 meeting in Chicago the Music Supervisors’ National 
Conference passed the following resolution: “Be it resolved, That Insti- 
tutions for the training of teachers and supervisors of school music be 
urged to exercise greater care in the selection of students who seek to 
undertake this training, by demanding not only that they have adequate 
previous musical study but also the assurance that they possess possibili- 
ties of necessary future development.” 

If there are great individual differences in the native musical en- 
dowment of the students who decide to enter upon courses leading to 
licenses for teaching and supervising music in the public schools, it is 
very important for those in charge of such courses to know how to select 
those students who are most likely to succeed and to eliminate those who 
are almost sure to fail. This selection and elimination should not be at- 
tempted except upon the basis of the carefully evaluated results of 
musical talent tests. Many teacher-training institutions are now turn- 
ing out more music supervisors than are assimilated in the teaching 
positions. It would be far better if fewer and better supervisors were 
prepared in these institutions. This can be accomplished by admitting 
into courses leading to licenses as music supervisors only those students 
who have high musical talent. It is likely that the time has now come 
when this can be done and the training institutions still meet the demand 
for teachers that is made upon them. 

The purpose of the present study is to determine whether the Sea- 
shore Musical Talent Tests have sufficient predictive value to justify 
their use in music departments of state teachers colleges as a means of 
deciding which students should be permitted to enter courses intended for 
the training of music supervisors and teachers. In order that the de- 
partments of music in the public schools shall finally occupy a place of 
equal standing with the other departments, this selection of students who 
desire to enter the music supervisor’s profession is becoming more im- 
perative each year. As long as so many students in courses intended 
for the training of music supervisors have only mediocre musical talent, 
such courses are sure to be pitched on too low a level to produce the 
best results even with the more talented students. 


Value of Knowing Accuracy and Reliability of Tests 


If the musical talent tests are accurate and reliable enough to be 
used as a means of selecting these students, they should be applied early 
in the freshman year. Obviously the state should not be expected to 
spend money upon the musical training of students who are so poorly 
endowed with musical talent as to make it doubtful whether they are 
capable of improving enough thru training to be converted into accept- 
able music supervisors. If the relation between the results of the musical 
talent tests and performance in the field of music is known, it will 
simplify the matter of selecting the students who shall enter the de- 
partment of music in the teachers college for the purpose of becoming 
supervisors of music. For this reason this study aims to throw some 
light upon the dependability of musical talent tests, and to point out 
some ways in which they may be used as selective agencies. Because 
predictive tests apply to specific situations, this study is limited to 
Indiana State Teachers College. 


Previous Studies 


Several studies have been conducted for the purpose of comparing 
native musical capacity and achievement. No attempt can be made 
in this brief report to evaluate these studies. It might be said, however, 
that these studies have, for the most part, been conducted for three 
purposes: first, to assist in classifying students who are entering music 
courses; second, to assist in selecting instrumental students who can 
succeed well enough finally to become members of school orchestras; and 
third, to predict the success of students entering music schools. 

In these studies all of the correlations between musical talent scores 
and grades in ear training and sight singing have been made between 
the separate traits of such talent and the grades. That is, correlations 
have been made between pitch test scores and grades in ear training and 
sight singing, and between each of the other traits of musical talent 
and such grades. This does not produce a single figure which represents 
the correlation between musical talent and grades in ear training and 
sight singing, but does produce five or six separate figures. These figures 
cannot be averaged because the traits that make up musical talent are 
not of equal value. 

The correlations that have been made in these studies between mu- 
sical talent and grades in ear training and sight singing have involved 
the entire group only. No attempts have been made to compare such cor- 
relations of highly talented students with those of less talented ones. 

No definite performance ability level on the music talent tests and 
the intelligence tests, up to which a student should measure if he is to be 
permitted to enter upon music courses, has been arrived at in these 
studies. This has been impractical because no single figure has been 
arrived at that represents the musical talent of the student. 

There is, therefore, need for a study that will help in the selection 
of students who seek to enter courses offered in the teachers colleges 
which are intended for the training of teachers and supervisors of 
music. Such a study should show what might be expected of students 
with various degrees of musical talent and intelligence. The present 
study attempts to solve some of the above-named problems, as well as 
others that have not been touched upon in previous studies. 




Students Used in the Study 


The present study deals with 240 students in the Indiana State 
Teachers College who have entered the music supervisor’s courses during 
the last six years, as well as with 142 students who are in general courses 
in the college. 


Tests Used 


Each of these students was given a musical talent test and an intel- 
ligence test. The term grades in ear training and sight singing were 
available for the special music students. For testing the musical talent 
of these students the Seashore¹ tests were used. These tests are too well 
known to need further description. The pitch test was given three times, 
once on each of three succeeding days, the highest score being recorded. 
Each of the other tests was given twice in a similar manner. Naturally 
there was some improvement with the repeating of the tests. The object 
in giving the tests more than once was to avoid any mistakes and to come 
as nearly as possible to the physiological limit. 

Only five tests were used (pitch, intensity, time, consonance, and 
memory), that of rhythm being omitted. When the accumulation of data 
was begun six years ago, the rhythm test was not available, so it could 
not consistently be added to the later tests. Seashore, in his Psychology 
of Musical Talent², says that “the sense of rhythm is an instinctive 
disposition to group recurrent sense impressions vividly and with precision, 
by time, or intensity, or both, in such a way as to derive pleasure and 
efficiency through the grouping.” Then it may be said that when we 
have a student’s record in intensity and time we have a fair index to 
his capacity in rhythm. The music talent tests were given to the students 
during the last weeks of the first term of their freshman year. The psy- 
chological tests were given at entrance, the American Council psycho- 
logical examination being used. 


Kwalwasser-Dykema Music Tests 


Before continuing with the report of the study based upon the Sea- 
shore tests, it should be said that during the year 1930 a new set of 
music talent tests has been placed upon the market. These tests are 
known as the Kwalwasser-Dykema music tests.³ They attempt to measure 
ten traits of musical talent. Five of these are the same as those 
measured by the Seashore test, namely, pitch, intensity, time, memory, 
and rhythm. In addition to these the new tests attempt to measure tone 
quality, tonal movement, musical taste, pitch imagery, and rhythm im- 
agery. The new tests do not provide for the measurement of consonance. 
The ten tests are given by means of phonograph records. The pitch test 
consists of forty trials. A tone is sustained, either at the same pitch 
thruout the test or at a higher or lower pitch, ending on the original 
pitch. The listener records “same” if the pitch has not been changed, or 
“different” if the pitch has been changed. There are thirty trials in the 
intensity test. The listener indicates whether the second tone heard is 
stronger or weaker than the first. In the time test there are twenty-five 
trials. The listener indicates whether the three tones sounded are the 
same or different in length. There are twenty-five trials in the tonal 
memory test. Each trial consists of a pair of melodic patterns. The 
listener is to judge whether the pair consists of the same or different halves. 
The rhythm test consists of twenty-five trials. Each trial consists of 
two parts. The listener is to judge whether the parts are the same or 
different rhythmically. In the tone quality test there are thirty trials. 
Each trial consists of two parts played either by the same instrument or 
by different instruments. The listener is to judge whether the two parts 
are the same or different in quality. The tonal movement test consists of 
thirty trials of four tones each. The tones are so arranged that a fifth 
tone, if added, would naturally go either up or down from the last tone 
given. The listener is to judge whether the fifth tone should go up or 
down. There are ten trials in the melodic taste test. A melody is played 
twice. The first half is the same in each instance. The second half is 
changed in the second performance. The listener is to indicate which he 
prefers. In the pitch imagery test there are twenty-five trials. The 
listener has the notation in his hand and is asked to judge whether what 
he hears is the same or different tonally. There are twenty-five trials in 
the rhythm imagery test. The listener has the notation in his hand and 
is asked to judge whether what he hears is the same or different 
rhythmically. 

¹ Seashore, Carl E. Manual of Instruction and Interpretations for Measures of 
Musical Talent. New York City: Columbia Graphophone Company, 1919. 

² Seashore, Carl E. The Psychology of Musical Talent. Newark: Silver, Burdett 
and Company, 1919. p. 115. 

³ Kwalwasser-Dykema. Manual of Directions for Victor Records Nos. 302, 303, 304, 
305, and 306. New York City: Carl Fischer, Inc., 1930. 

In order to determine the reliability of these new tests they were 
given to eighty-five special music students at the Indiana State Teach- 
ers College. These students had already taken the Seashore tests. Every 
effort was made to have these new tests given just as the Seashore tests 
were given. The pitch test was given three times and each of the others 
was given twice. The best result was recorded. Table I shows the me- 
dians, means, and standard deviations for the Seashore test percentile 
rankings and the Kwalwasser-Dykema test percentile rankings for these 
students. There is a tendency for the means and medians to be much 
higher in the Kwalwasser-Dykema test than in the Seashore test. For 
instance, the mean in the Kwalwasser-Dykema test in memory is 98.085 
and the standard deviation is only .5195. This indicates that most of the 
scores are close to 98. These students are special music students, but it 
is not likely that they are so evenly balanced in the capacity to remember 
tones as is indicated by this test. The mean of the Seashore test in mem- 
ory is 81.985, while the standard deviation is 20.15. 




TABLE I. MEANS, MEDIANS, AND STANDARD DEVIATIONS OF 
THE PERCENTILE RANKINGS ON THE SEASHORE TEST 
AND THE KWALWASSER-DYKEMA TEST 


                                                        Number of 
Tests                    Mean      S. D.     Median     Students 

SEASHORE TEST 
Pitch..................  81.025    21.25     88.18      79 
Intensity..............  72.76     20.5      73.75      77 
Time...................  77.95     26.15     87.5       78 
Consonance.............  84.295    19.80     90.0       78 
Memory.................  81.985    20.15     87.5       78 

KWALWASSER-DYKEMA TEST 
Pitch..................  83.02     16.195    85.357     84 
Intensity..............  88.555    8.65      90.3       83 
Time...................  93.958    8.5       96.665     82 
Memory.................  98.085    .5195     9.185 [sic] 83 
Rhythm.................  91.47     12.89     94.078     83 
Tone Quality...........  94.585    10.05     95.892     83 
Tonal Movement.........  97.03     12.89     99.295     83 
Melodic Taste..........  92.975    12.728    94.44      84 
Pitch Imagery..........  98.68     6.285     99.72      82 
Rhythm Imagery.........  93.27     7.865     95.64      82 


Any test in which all make so nearly the highest possible score can- 
not be very dependable. Judged by this criterion none of the Kwalwasser- 
Dykema tests, with the possible exception of pitch and intensity, would 
be very reliable. However, the fact that the means are not so high and 
the standard deviations not so small in the pitch and intensity scores 
does not indicate that these tests are reliable. Further study of these 
tests is necessary before they can be pronounced reliable. 
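
The criterion applied above (a test on which nearly everyone scores at the top cannot discriminate) can be illustrated with a minimal sketch. The means and standard deviations are those printed in Table I; the labels of rows whose names are illegible in the scan are inferred from row order, and the cutoff of 90 is a hypothetical value chosen only for illustration, not one used in the report.

```python
# Kwalwasser-Dykema rows of Table I as (mean, standard deviation).
# Labels for rows with illegible names are inferred from row order (assumption).
KD_TABLE_I = {
    "Pitch": (83.02, 16.195),
    "Intensity": (88.555, 8.65),
    "Time": (93.958, 8.5),
    "Memory": (98.085, 0.5195),
    "Rhythm": (91.47, 12.89),
    "Tone Quality": (94.585, 10.05),
    "Tonal Movement": (97.03, 12.89),
    "Melodic Taste": (92.975, 12.728),
    "Pitch Imagery": (98.68, 6.285),
    "Rhythm Imagery": (93.27, 7.865),
}

CEILING_MEAN = 90.0  # hypothetical cutoff, not taken from the report


def near_ceiling(table, cutoff=CEILING_MEAN):
    """Return the tests whose mean percentile rank is at or above the cutoff."""
    return [name for name, (mean, _sd) in table.items() if mean >= cutoff]


flagged = near_ceiling(KD_TABLE_I)
print(flagged)
```

Under this assumed cutoff every test except pitch and intensity is flagged, which agrees with the exception noted in the text.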

Between two tests which are constructed for measuring the same 
traits there should be a very high correlation. There are five of the 
Kwalwasser-Dykema battery of tests and the Seashore battery of tests 
which presumably measure the same traits. These are tests for pitch, 
intensity, time, tonal memory, and rhythm. Table II shows the results of 
correlations between the scores on these tests for the Seashore battery 
and the Kwalwasser-Dykema battery. 


TABLE II. CORRELATION OF SCORES ON THE SEASHORE 
TEST AND THE KWALWASSER-DYKEMA TEST 


Test                     r(S)(KD) ± PE       Number of students 
[illegible]              .324 ± .068         78 
[illegible]              .228 ± .070         83 




None of these correlations are as high as they should be if both tests 
were good measures of the traits in question. Between the pitch tests of 
the two batteries the correlation is only .1445. This indicates that there 
is only a slight tendency for the scores to be high in one test when they 
are high in the other. The correlation between the intensity tests is 
-.1195. This indicates that there is a tendency for the scores to be low 
in one test when they are high in the other. The highest correlation, .401, 
is between the time tests, but this is not high enough to be significant. 
It seems safe to say, on the basis of these correlations, that the two tests 
do not measure with equal accuracy. If they did, the correlations would 
be very much higher. 

The scores for each of these five tests for both the Seashore battery 
and the Kwalwasser-Dykema battery were correlated with the term 
grades in ear training and sight singing for the same students. Table III 
shows the results of these correlations. 


TABLE III. CORRELATION OF MUSIC TALENT SCORES ON THE 
SEASHORE TEST AND THE KWALWASSER-DYKEMA TEST 
AND TERM GRADES IN EAR TRAINING AND SIGHT 
SINGING 


                    Seashore test                      Kwalwasser-Dykema test 

Test                r(M)(G) ± PE           Number of   r(M)(G) ± PE     Number of 
                                           students                     students 

Pitch.............  [illegible] ± .0677    79          -.118 ± .071     84 
Intensity.........  .2967 ± .0695          77          -.167 ± .072     83 
Time..............  .3628 ± .0656          80          -.0299 ± .074    82 
Memory............  .5631 ± .052           79          .40 ± .062       83 
Rhythm............  .209 ± .071            84          .1908 ± .071     83 


In the Seashore battery all of the correlations are about the same as 
that usually obtained between scores in natural capacity tests and 
achievement. In the Kwalwasser-Dykema battery three of the correla- 
tions are negative and only one, that of memory, is high enough to be 
taken into account. In every case the Seashore test scores correlate 
higher with term grades than do the Kwalwasser-Dykema test scores. 
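
The "± PE" figures attached to the correlations are not explained in the report. They appear to agree with the customary probable-error formula PE = 0.6745(1 - r²)/√N, and the sketch below checks that assumption against the two memory rows of Table III. The function name is hypothetical.

```python
import math


def probable_error_r(r, n):
    """Probable error of a correlation coefficient r based on n cases.

    Assumes the customary formula 0.6745 * (1 - r^2) / sqrt(n); the report
    itself does not state which formula was used.
    """
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


# Memory rows of Table III: Seashore (.5631, N = 79) and Kwalwasser-Dykema (.40, N = 83).
seashore_memory_pe = round(probable_error_r(0.5631, 79), 3)
kd_memory_pe = round(probable_error_r(0.40, 83), 3)
print(seashore_memory_pe, kd_memory_pe)
```

Both values round to the printed figures (.052 and .062), which supports, without proving, the assumed formula.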

As for the remaining five tests in the Kwalwasser-Dykema battery, 
it might be a question whether any one of them is entirely a test of na- 
tive capacity. One’s ability to judge in the matter of tone quality, me- 
lodic taste, and tonal movement (resolution) certainly comes with train- 
ing. The test in pitch imagery, where the student looks at the score and 
determines whether what is being heard is the same as represented by 
the notes in his score, is a reading test. The same is true of the rhythm 
imagery test. Table IV shows the results of correlations between the 
scores on these tests and term grades in ear training and sight singing. 




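TABLE IV. CORRELATION OF MUSIC TALENT SCORES ON THE 
KWALWASSER-DYKEMA TEST AND TERM GRADES IN EAR 
TRAINING AND SIGHT SINGING 


Test                     r(M)(G) ± PE            Number of students 
Tone Quality...........  .2063 ± .071            83 
Tonal Movement.........  .2477 ± .069            83 
Melodic Taste..........  .1896 ± [illegible]     [illegible] 
Pitch Imagery..........  .191 ± .071             82 
Rhythm Imagery.........  .3943 ± .063            82 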


It would seem that these correlations should be higher since the tests 
measure, to some extent, the student’s training. Such, however, is not 
the case. Tone quality and ear training correlate .2063; tonal movement, 
.2477; melodic taste, .1896; pitch imagery, .191, and rhythm imagery, 
.3943. The consonance test of the Seashore battery correlates with ear 
training .326. 

A few statements should be made regarding the manner, given in the 
manual, of computing the percentile ranks in these new tests. One test 
has forty trials, three have thirty trials each, five have twenty-five trials 
each, and one has ten trials. In computing the norms for the several 
tests the number of right answers is considered regardless of the number 
of trials. The per cent of right answers is not considered. Standards are 
obtained for the entire battery of tests by adding together the number of 
right answers in all tests and computing norms for the entire battery. 
The present writer questions these procedures because the various tests 
are not of equal value and therefore should not be combined by addition, 
and because of the unequal number of trials in the various tests. When 
norms are made for the various tests it would seem that they should be 
based upon the per cent right rather than upon the number right, since 
the number of trials is not equal for all tests. 
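
The objection to norms based on the number right can be made concrete with a short sketch. The trial counts are those given in the description of the battery; the student scores are hypothetical, and the function names are illustrative only.

```python
# Trial counts per Kwalwasser-Dykema test, as described in the text.
TRIALS = {
    "Pitch": 40,
    "Intensity": 30,
    "Time": 25,
    "Tonal Memory": 25,
    "Rhythm": 25,
    "Tone Quality": 30,
    "Tonal Movement": 30,
    "Melodic Taste": 10,
    "Pitch Imagery": 25,
    "Rhythm Imagery": 25,
}


def percent_right(number_right, trials):
    """Score a test as per cent right so tests of unequal length are comparable."""
    if not 0 <= number_right <= trials:
        raise ValueError("number_right must lie between 0 and trials")
    return 100.0 * number_right / trials


def battery_share(test):
    """Fraction of the raw battery total that one test can contribute."""
    return TRIALS[test] / sum(TRIALS.values())


# Hypothetical student: 20 of 40 right on pitch, 8 of 10 right on melodic taste.
print(percent_right(20, TRIALS["Pitch"]))
print(percent_right(8, TRIALS["Melodic Taste"]))
print(round(battery_share("Pitch"), 3), round(battery_share("Melodic Taste"), 3))
```

When right answers are simply added, the pitch test can contribute four times as many points to the battery total as the melodic taste test, whatever the relative importance of the two traits.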

The authors state in the manual that the norms are based upon 
scores earned by two thousand grade and high school pupils. Norms are 
not given for children of various ages. One set of norms is given for 
everybody. This may account for some of the inconsistencies found in 
the present study of these tests. In the present study the tests were 
given to college students, while the norms used were based upon scores 
earned by grade and high school pupils. This would seem to be a weak- 
ness that needs correcting. 

It would seem fair to draw the following conclusions from the above 
data and their treatment: First, because the means are so high and the 
standard deviations are so small in most of the tests it would seem that 
the tests do not measure the varying capacities of the students. Second, 
since the correlations between the scores on the Seashore tests and those 
on the Kwalwasser-Dykema tests constructed to measure the same traits 
are so low it is evident that one of the tests does not measure accurately. 
Third, when the scores on the component parts of the Seashore tests are 
correlated with grades in ear training and sight singing the result is 
about the same as is usually obtained when native capacity scores are 
correlated with achievement. When the scores on these same component 
parts of the Kwalwasser-Dykema tests are correlated with grades in ear 
training and sight singing, three of the correlations are negative and the 
other two are lower than those of the Seashore tests which measure the 
same traits. 

It would therefore seem fair to conclude that the Kwalwasser- 
Dykema music tests do not measure as reliably as do the Seashore tests. 
The Kwalwasser-Dykema tests are more interesting to administer than 
the Seashore tests. They are much shorter and therefore would not tire 
the children in the lower grades so much. They are also much easier to 
score. 


Relation between Music Talent of Special Music Students and Students 
in Non-Music Courses as Measured by the Seashore Tests 


In order to find whether the 240 special music students who were 
being investigated were students with more than the average native 
musical capacity, they were compared with the 142 students who were in 
courses other than music. Table V shows the results of this comparison. 


TABLE V. COMPARISON OF MUSIC TALENT OF SPECIAL 
MUSIC STUDENTS AND STUDENTS ON GENERAL COURSES 


                                                          Number of 
Test                   M ± PE          S. D.    Median    students 

SPECIAL MUSIC STUDENTS 
Pitch................  83.2 ± .79      18.4     90.6      241 
Intensity............  67.8 ± 1.08     24.8     72.7      238 
Time.................  72.6 ± 1.23     28.1     81.8      239 
Consonance...........  77.3 ± 1.16     26.7     85.1      239 
Memory...............  80.4 ± .85      19.5     85.4      234 

NON-MUSIC STUDENTS 
Pitch................  40.9 ± 2.33     39.9     31.7      134 
Intensity............  50.2 ± 1.32     23       50.3      139 
Time.................  [row illegible] 
Consonance...........  57.9 ± 1.75     31       58.7      143 
Memory...............  55.3 ± 1.64     29       63.1      142 

The scores are given in the percentile norms as given in the Sea- 
shore manual. The medians of the special music students in the five 
traits were an average of 31 points higher than those of the students in 
other courses. Fifty is supposed to be the average for the whole popu- 
lation, and these non-music students hover around that figure except in 
pitch. Here they fall below. The special music students made an average 
of 32 points higher than the average for the whole population. 
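
The probable errors attached to the means in Table V are likewise unexplained. A minimal sketch, assuming the customary formula PE = 0.6745 × S.D./√N, reproduces two of the printed values; the function name is hypothetical.

```python
import math


def probable_error_mean(sd, n):
    """Probable error of a mean, assuming the customary 0.6745 * SD / sqrt(n)."""
    return 0.6745 * sd / math.sqrt(n)


# Table V: special music students, intensity (S.D. 24.8, N = 238),
# and non-music students, memory (S.D. 29, N = 142).
music_intensity_pe = round(probable_error_mean(24.8, 238), 2)
non_music_memory_pe = round(probable_error_mean(29, 142), 2)
print(music_intensity_pe, non_music_memory_pe)
```

These round to 1.08 and 1.64, the figures printed in the table, so the assumed formula is at least consistent with the report.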




Relation between Intelligence Scores of Special Music Students and 
Students in Non-Music Courses 


In order to find whether these special music students who were being 
investigated were as intelligent on the average as the entire student 
body, their percentile rankings on the psychological tests were compared 
with those of the entire student body. Table VI shows the results of this 
comparison. 


TABLE VI. MEAN AND MEDIAN PERCENTILE RATINGS OF 
THE STUDENTS IN THE INDIANA STATE TEACHERS COLLEGE 
ON THE BASIS OF THEIR PERFORMANCE ON THE 
PSYCHOLOGICAL TESTS GIVEN AT ENTRANCE. FALL 
QUARTER, 1929 


                                                            Number of 
                                       Median     Mean      students 

Total Regular College................  68.38      61.8      399 
Total [illegible]....................  44.04      45.4      304 
Freshmen on Special Music Course 
  during six-year period.............  56.31      57.37     228 
Total College Freshmen...............  .....      62        149 
Total Special Freshmen...............  .....      47        255 
Total Elementary Freshmen............  .....      41        163 


These tests are given to all students during the first term of their 
freshman year. The mean of the special music group is 57.37 against 
49.53 for all the other freshmen. The mean of the special music group is 
higher than that of any of the other groups with which it is fair to com- 
pare it except the group of regular college freshmen. We have here, 
then, a group of students far superior in music talent to the average 
student in school, superior in their performance on the psychological 
tests to the entire school, and only slightly inferior on these tests to the 
regular college group. It is interesting to note that the performance on 
the psychological tests of this group of special music students is much 
higher than that of all the students in special subjects such as art, do- 
mestic science, commerce, physical education, industrial arts, music, etc. 
The total school group is made up of freshmen, sophomores, juniors, and 
seniors. The upper three grades of this group are, of course, a more or 
less select group, the inferior students having had time to drop out. The 
group of special music students does not have this advantage, and with 
this handicap the mean of this group is five points higher than the total 
school. 


Relation between Music Talent and Term Grades in Ear Training and 
Sight Singing 


It may be argued that, if the teaching of such subjects as ear train- 
ing and sight singing were pitched on the right level for the entire class 
and the system of giving term grades were accurate and reliable, there 
should be a rather high correlation between the music talent of the spe- 
cial music students and their term grades in these subjects. If such cor- 
relation should prove to be high, it would no doubt indicate that the test 
used to measure the music talent had considerable predictive value. If 
the correlation should prove to be low, it might be due to poor teaching, 
poor student effort, a poor grading system, or poor testing for music 
talent. 

The matter of making this correlation was not as simple as it would 
seem at first thought. In the first place the scores in music talent were 
in pitch, intensity, time, consonance, and memory. One could make a cor- 
relation between the term grades and each of these traits, but the result 
would be in five separate figures of unequal value. It is not permissible 
to add the raw scores of these five traits and divide by five because of 
the unequal value assigned to each of the traits. One might weight the 
traits and average them, but such procedure would, while mathematically 
accurate, be without any guaranteed scientific foundation. There is, how- 
ever, a technique by which scores on different tests may be combined. 
Holzinger⁴, in his Statistical Methods for Students in Education, says that, 
being abstract numbers, standard scores on several tests may be combined by 
addition. The standard score of a student gives his relative position in 
the group in terms of a number of standard deviations above or below 
the mean. The standard score is found by subtracting the mean from the 
raw score and dividing by the standard deviation. The result will be a 
number expressing the number of standard deviations either above or 
below the mean. This number will be plus if it is above the mean or 
minus if below. 

There were 128 students for whom music talent scores were avail- 
able in each of the traits, and for whom three to four term grades in ear 
training and sight singing were available. The standard score in pitch 
for each of these students was found. Then the standard scores for each 
student were found in intensity, time, consonance, and memory. The com- 
posite standard score for each student was then found by adding his five 
standard scores together. This made it possible to express the music tal- 
ent of each student with a single figure. A correlation was then made 
for the 128 students between composite standard scores in music talent 
and average term grades in ear training and sight singing. The Product 
Moment Formula was used in making this correlation as well as all cor- 
relations in the present study. The result of the correlation was .399. 
This seems low, considering the fact that the correlation was made be- 
tween music talent and term grades in those subjects which certainly 
depend so largely upon this talent. However, it is about the same as is 
usually obtained when native capacity scores are correlated with achieve- 
ment. 
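
The procedure just described (standard scores for each trait, a composite formed by addition, and a product-moment correlation with grades) can be sketched as follows. All scores and grades are hypothetical, and dividing by N rather than N - 1 in the standard deviation is an assumption, since the report does not say which was used.

```python
import math
from statistics import mean, pstdev


def standard_scores(raw):
    """Convert raw scores to standard scores: (score - mean) / standard deviation."""
    m = mean(raw)
    s = pstdev(raw)  # population S.D. (divide by N); an assumption
    return [(x - m) / s for x in raw]


def composite_scores(scores_by_trait):
    """Add each student's standard scores across traits into one composite."""
    z_columns = [standard_scores(column) for column in scores_by_trait.values()]
    return [sum(values) for values in zip(*z_columns)]


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical percentile scores for four students on the five traits.
talent = {
    "pitch": [90, 70, 50, 85],
    "intensity": [60, 80, 40, 75],
    "time": [88, 65, 55, 70],
    "consonance": [92, 60, 45, 80],
    "memory": [85, 75, 50, 90],
}
grades = [3.5, 3.0, 2.0, 3.2]  # hypothetical average term grades

composite = composite_scores(talent)
print([round(c, 2) for c in composite])
print(round(pearson_r(composite, grades), 3))
```

Because each trait is first expressed in standard-deviation units, the five scores enter the composite on a common scale, which is the property the quoted rule relies upon.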

Mr. Wilson⁵, at Ohio University, found a correlation between pitch 
and grades in ear training of only .05. In intensity he found it to be .36; 
in time, .31; in consonance, .28; and in memory, .27. Hull⁶, in his 
Aptitude Testing, gives tables showing the relation of the correlation 
coefficient (r) to the per cent of forecasting efficiency. He says that “the 
most striking point about this table is the remarkably small forecasting 
efficiencies corresponding to R values below .50.” He shows that when 
the correlation coefficient is .40 it carries only an eight per cent 
forecasting efficiency. It seems, therefore, from these correlations that the 
predictive value of these tests is not as great as we have usually thought 
it to be. However, there are other factors to be taken into consideration. 

⁴ Holzinger, Karl John. Statistical Methods for Students in Education. Boston: 
Ginn and Company, 1928. 

⁵ Wilson, E. Emmett. “The Prognostic Value for Music Success of Several Types 
of Tests.” Music Supervisors’ Journal. Vol. 16, No. 1, 1930. 
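
The relation between r and forecasting efficiency cited from Hull can be checked with a short sketch. It assumes the usual form of the index, E = 1 - √(1 - r²); the report quotes only the resulting percentage and does not print the formula.

```python
import math


def forecasting_efficiency(r):
    """Index of forecasting efficiency, assumed to be 1 - sqrt(1 - r^2)."""
    return 1.0 - math.sqrt(1.0 - r * r)


for r in (0.40, 0.50):
    print(r, round(100 * forecasting_efficiency(r), 1))
```

For r = .40 the index comes to roughly eight per cent, which matches the figure quoted in the text.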


Relation between Intelligence and Term Grades in Ear Training and 
Sight Singing 


Does intelligence perhaps have as much to do with grades, even in 
ear training and sight singing, as music talent? A correlation was made 
between the percentile ratings of these students on the basis of their 
performance on the psychological test and the same average grades in 
ear training and sight singing. The result was a correlation of .3599, or 
practically the same as for music talent and term grades. It is apparent 
that intelligence has as great an effect upon achievement in ear training 
and sight singing as does music talent, altho neither is very potent. 

It seems reasonable to suppose that, if each student were doing any- 
thing like his best, there would be more correlation between the results, 
measured objectively, and his native capacity. If it should develop that 
there is higher correlation between the term grades of the least talented 
students and their native capacity, we should have to conclude that the 
fault lies in the fact that our instruction does not sufficiently challenge 
the more talented students. 


Relation between the Music Talent Scores and the Grades in Ear Training 
and Sight Singing of the Talented Students Compared 
with Those of the Less Talented Ones 


The students were divided into four groups according to their talent 
scores, Group I being the least talented group and Group IV the most 
talented. The music talent scores for the students in each group were 
correlated with the term grades. Table VII shows the results of this 
correlation. 


TABLE VII. CORRELATION OF MUSIC TALENT AND TERM 
GRADES IN EAR TRAINING AND SIGHT SINGING 


Group    r(M)(G) ± P.E.    Number of students
         .3875 ± .1009     35


* Hull, Clark L. Aptitude Testing. New York: World Book Company, 1928. pp. 
273-5. 




The correlation for Group I (least talented) was .375, for Groups 
II and III the correlation was slightly lower, and for Group IV (the most 
talented) it was —.058. 

Similar correlations between intelligence scores and term grades for 
these groups showed an opposite result. That is, for the least talented 
group the correlation was .004, while for the most talented group it was 
.5496. Table VIII shows the results of these correlations. 


TABLE VIII. CORRELATION OF PSYCHOLOGICAL RATINGS 
AND AVERAGE TERM GRADES IN EAR TRAINING AND 
SIGHT SINGING 


Group         r(G)(P) ± P.E.    Number of students
IV            .5496 ± .0981     23
All groups    .3399 ± .0519     128
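
The probable errors attached to these coefficients can be checked with the conventional formula P.E. = 0.6745 (1 - r^2) / sqrt(N). That the author used this formula is an assumption, since the paper does not state it. The sketch applies it to the figures given for the most talented group (r = .5496, 23 students).

```python
import math


def probable_error_r(r: float, n: int) -> float:
    """Probable error of a correlation coefficient: 0.6745 * (1 - r^2) / sqrt(n).

    Assumption: the "P.E." values in the paper follow this conventional formula.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


# Most talented group: r = .5496 with 23 students.
pe = probable_error_r(0.5496, 23)
print(f"P.E. = {pe:.4f}")              # close to the .0981 given for this group
print(f"r / P.E. = {0.5496 / pe:.1f}")  # how many probable errors r is from zero
```

A coefficient several times its probable error was conventionally treated as dependable, which is why the ratio is printed alongside the probable error.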


Similar correlations between intelligence scores and music talent 
scores for these groups were made. Table IX shows the results of these 
correlations. Table X shows the means and medians of the psychological 
ratings for the same groups. From this table it would seem that the 
higher the music talent of the student the higher will be his intelligence. 


TABLE IX. CORRELATION OF PSYCHOLOGICAL RATINGS AND 
MUSIC TALENT 


Group    r(M)(P) ± P.E.    Number of students


TABLE X. MEANS AND MEDIANS OF THE PSYCHOLOGICAL RATINGS


Group    Mean    Median    Number of students




From these data it is evident that students with very low music tal- 
ent are not able to make progress in ear training and sight singing in 
keeping with their psychological ratings. Their progress seems influenced 
more by what little talent they have than by their rating on the psycho- 
logical test. The progress of the most talented group, however, is in- 
fluenced almost altogether by their psychological ratings. This would 
seem to show that in order for the psychological ratings to function in 
musical learning there must be high musical talent, but that the prog- 
ress of students above the median on the musical talent scores will be 
more in proportion to their psychological ratings than to their musical 
talent scores. That is, intelligence (if one may use the term) cannot be 
substituted for musical talent, but given a high degree of musical talent, 
the progress in musical learning is in proportion to the intelligence. 
Given a low degree of musical talent the progress is in proportion to the 
talent. 

The data presented above seem to justify the following conclusions: 
(1) there is a much higher correlation between the psychological ratings 
and term grades in ear training and sight singing of the talented stu- 
dents than of the less talented ones; (2) there is a higher correlation 
between the psychological ratings and the music talent of the talented 
students than of the less talented ones; and (3) the mean of the psycho- 
logical ratings is much higher for the talented students than for those 
with less talent. 

Table XI shows the percentages of students with music talent below 
the median and below the first quartile who have grades in ear training 
and sight singing at various levels. 


TABLE XI. PERCENTAGES OF STUDENTS WITH MUSIC TAL- 
ENT BELOW THE MEDIAN AND BELOW QUARTILE ONE 
WHO MADE GRADES IN EAR TRAINING AND SIGHT SING- 
ING AT VARIOUS LEVELS 


                  Per cent      Per cent      Per cent
Music talent      below Q1      above Q3      below median    Number of
                  in grades     in grades     in grades       students
Below Q1          34.5          .065          69.5            41
Below median                                  64.5            79


Table XII shows the percentages of students at various levels of 
music talent and psychological ratings who made grades in ear training 
and sight singing below the median. 




TABLE XII. PERCENTAGES OF STUDENTS AT VARIOUS LEV- 
ELS OF MUSIC TALENT AND PSYCHOLOGICAL RATINGS 
WHO MADE GRADES IN EAR TRAINING AND SIGHT SING- 
ING BELOW THE MEDIAN 


                                             Per cent
Music talent        Psychological rating     below median    Number of
                                             in grades       students
Below median        Above median             54.7            33
Above median        Below median             76.2            21
Below median        Below median             71.5            44
Above median        Above median             28.6            28
                    Above median             57.9            19
Below Q1            Below median             81.5            27


The data presented above seem to substantiate the conclusions 
reached by the correlations between music talent and average term 
grades in ear training and sight singing and between psychological rat- 
ings and such term grades. That is, general mental powers affect term 
grades in ear training and sight singing to about the same extent that 
musical talent affects them. 


Conclusions and Interpretations 


The analysis of the data presented in the preceding pages leads to 
the following conclusions: 

1. The individual differences in the native musical endowment of 
students entering upon courses intended for the training of music super- 
visors and teachers are very great. The highest talent is represented by 
the composite standard score +5.022, while the lowest talent is repre- 
sented by the composite score —12.533. In some of the traits the range 
is all the way from 1 to 100 in raw scores. This certainly points to the 
necessity of eliminating those students with mediocre talent. 

2. The medians of the various music talent scores of the special 
music students average 32 points higher than those of the students in 
the non-music courses. 

3. The mean of the psychological ratings of the freshman special 
music students is 7.84 points higher than that of the freshmen in the 
entire school. 

4. The correlation between music talent scores and term grades in 
ear training and sight singing is .399 for the entire group of music stu- 
dents. The correlations are much lower for the talented students than 
for those with less talent. 

5. The correlation between psychological ratings and term grades 
in ear training and sight singing is .3399 for the entire group. The correlations
are much higher for the talented students than for those with
less talent. 




6. The data seem to show that the higher the musical talent of a 
group of students the higher will be their psychological ratings. There is 
a difference of 16.43 points between the means of the psychological rat- 
ings for the students in the upper quartile of music talent and those in 
the lower quartile. 

7. Of the students whose musical talent scores fall in the lowest 
quartile, 34.5 per cent make grades in the lowest quartile while 69.5 per 
cent with such scores make grades below the median, and only .065 per 
cent make grades in the upper quartile. Of students with talent scores 
below the median, 64.5 per cent make grades below the median. 

8. Of the students with musical talent scores below the median and 
psychological ratings above the median, 54.7 per cent make grades below 
the median. When this is reversed, 76.2 per cent make grades below the 
median. When both musical talent scores and psychological ratings are 
below the median, 71.5 per cent make grades below the median; and 
when both are above the median only 28.6 per cent make grades below 
the median. 

9. When both musical talent scores and psychological ratings are 
in the lowest quartile, 90 per cent make grades below the median. When 
musical talent is in the lowest quartile and psychological ratings are 
below the median, 81.5 per cent make grades below the median. 


Interpretation of Findings 


The purpose of the study is to determine whether the musical talent 
tests have sufficient predictive value to justify their use in deciding which
students should be permitted to enter courses intended for the training 
of music supervisors and teachers. It can be predicted that 90 per cent 
of students with talent scores and psychological ratings in the lowest 
quartile will make grades in ear training and sight singing below the 
median, and that practically none of these students will make grades in 
the upper quartile. It would seem safe to eliminate all of these students 
from the music supervisor’s course. It can be predicted that 81.5 per 
cent of the students with musical talent scores in the lowest quartile and 
psychological ratings below the median will make grades below the 
median. It is not likely that any of these students could succeed as music 
supervisors even if a few of them could make enough good grades to be 
permitted to remain in school until they finish the course. Practically 
three-fourths of the students with talent scores and psychological rat- 
ings above the median will make grades above the median. As far as 
native equipment is concerned these students could become successful 
music teachers or supervisors. Music talent tests and intelligence tests, 
however, cannot predict success because something else is required for 
success besides native capacity. These tests can predict failure. 


Practical Application of the Findings for Vocational Guidance 


These talent tests have been used by the writer for several years, in 
a more or less haphazard way, for the purpose of admitting students into
the department of music with the intention of becoming supervisors of 
music. Even in the more or less unscientific way in which they have
been used it has been felt that not many mistakes have been made. It
will now be possible to set a standard in musical talent and intelligence 
below which success will be practically impossible. At the end of the 
first term’s work of the freshman year the student should be given a 
thoro test for musical talent and intelligence. If his scores in the two 
traits fall in the lowest quartile it would seem wise to suggest that he 
enroll in some other course than the music course. If his score in music 
talent falls in the lowest quartile and his intelligence score is below
the median, it would probably be wise to suggest another course than the 
music course for him. It might be wise to eliminate all students whose 
scores fall in the lowest quartile regardless of their intelligence scores. 
Since these tests are to be given at the end of the first term, a check can 
be made with the student’s term grades in ear training and sight sing- 
ing. If there is any conflict between the term grades and the talent and 
intelligence scores, the student might be given another term in which to 
show his ability. 

It is recognized that this study and the findings herewith discussed 
are based on the standard of work done at the Indiana State Teachers 
College, and there would naturally be a variation in the application of 
the procedure in different institutions. However, if each institution that 
wishes to use the same procedure as a basis of admission will work out 
the medians and quartiles from the composite standard scores of its own 
students, the tests should serve as good selective agents. 
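
A minimal sketch of how an institution might work out its own cut-offs and apply the suggestions above. The function names, the wording of the three recommendations, and the sample scores are all hypothetical; only the rules themselves (lowest quartile on both measures, or lowest quartile in talent with intelligence below the median) come from the text.

```python
from statistics import median, quantiles


def recommend(talent, rating, talent_q1, rating_q1, rating_median):
    """Apply the screening suggestions from the text to one student.

    The recommendation labels are illustrative wording, not the author's.
    """
    if talent < talent_q1 and rating < rating_q1:
        return "suggest another course"
    if talent < talent_q1 and rating < rating_median:
        return "probably suggest another course"
    return "no recommendation from the tests alone"


def screen(students):
    """students: list of (name, composite talent score, psychological rating)."""
    talents = [t for _, t, _ in students]
    ratings = [p for _, _, p in students]
    # Local norms: cut-offs are computed from the institution's own students.
    talent_q1 = quantiles(talents, n=4)[0]
    rating_q1 = quantiles(ratings, n=4)[0]
    rating_median = median(ratings)
    return {name: recommend(t, p, talent_q1, rating_q1, rating_median)
            for name, t, p in students}


# Invented scores, for illustration only.
sample = [("A", -9.1, 12), ("B", -4.0, 55), ("C", 0.3, 40), ("D", 1.8, 71),
          ("E", 3.9, 90), ("F", -7.5, 45), ("G", 2.2, 60), ("H", -1.0, 48)]
for name, advice in screen(sample).items():
    print(name, advice)
```

In keeping with the text, a recommendation of this kind would still be checked against the student's term grades before any action is taken.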


A Study in Methods of College Teaching 


A. R. EIKENBERRY, Professor of Education, Manchester College, 
North Manchester 


Procedure 


Enrollment and Equipment. For the past several years there had 
been in Manchester College three sections of college students in the in- 
troductory psychology class. There were on the average about 120 stu- 
dents in the class. Plans were made in anticipation of similar conditions 
prior to the opening of school in the fall of 1929 and, in consultation with 
the Dean of the college, arrangements were made for enrollment. These 
plans were definitely mapped out at the beginning of the year, but were 
not rigidly adhered to. It was intended from the beginning that changes 
be made during the progress of the experiment if such changes seemed 
advisable. Realizing that as soon as plans become rigid, procedure be- 
comes formalized, it was thought best to provide for modifications. Most 
of the changes were, however, of little significance and could not, in any 
large way, modify the results. 

Enrolling officers were instructed to enroll students in the sections 
which they preferred, except that only those who had open periods at 
8:30 or at 3 o’clock were permitted to enroll in the experimental section 
which met at 7:30 in the morning. The other two sections met at 11 and 
1 o’clock. An effort was made to get as nearly the same number of stu- 
dents in each class as possible. Fortunately this worked out very satis- 
factorily. When enrollment was completed there were 39 in the 7:30 and 
in the 11 o’clock sections and 43 in the 1 o’clock section. The group 
meeting at 7:30 was the experimental group and the other two were 
control groups. Credit for such a satisfactory “experimental set-up” is 
largely due to the splendid coöperation of the Dean and other officers in
the institution.

Recitation periods are fifty minutes in length. Classes meet four 
times a week and ordinarily receive four hours credit per term. Partly 
because of the state requirements for teachers, and partly because of 
conviction that one term is insufficient to cover the fundamentals of 
psychology, two whole terms are devoted to the general course. The first 
eighteen weeks were used for the general field of psychology and during 
the last six weeks problems and viewpoints were emphasized. This ex- 
periment covers only the first eighteen weeks. Students were urged to 
enroll for both terms during the same year and most of them did so. 
However, because of conflicts there was some shift in the sections at the 
beginning of the second term. 

Woodworth’s latest book was used as a text and was in the hands of 
all of the students. This was supplemented with the major portion of 
Dashiell’s Fundamentals of Objective Psychology and with a consider- 
able portion of Robinson and Robinson’s Readings in General Psychology. 




Six of the former and ten of the latter were available. Scattered refer- 
ences were taken from many other books. Suggested readings were ap- 
proximately the same in all sections. 


Conduct of Class Period. The method of teaching and the conduct of 
the class period in the control sections were of the ordinary conventional 
type. The material was handled in terms of units and the class period 
was used for lecture-discussion-recitation procedure. Assignments were 
usually made several days in advance. Students were encouraged, but 
not required, to take notes. There was no check on the amount of out- 
side reading done except the scattered examination questions that were 
taken from some of this material. 

For the experimental section the procedure was as follows: Sets of 
problems and questions were made out for most of these topics and 
mimeographed copies were given to the students. Written reports on 
these problems were handed in before the topics were discussed in class. 
Frequently, however, the topic was introduced and problems suggested 
before the reports were completed. These reports were checked and re- 
turned to the student. Students were encouraged to study the topic as a 
whole and not to be satisfied with the mere answers to the questions. 
During the regular class hour these problems were discussed, reports 
were given, and some final summaries were presented. Some additional 
material was given in lecture form. An effort was made to dispense with 
the formality of the usual recitation. 

Since these students were enrolled so as to have an open period 
either at 8:30 or at 3 o’clock, they were encouraged to spend one of 
these hours in the study room. Considerable liberty was granted for stu- 
dents to come and go as they pleased. An effort was made to make this 
room the most profitable place to study. Approximately two hundred 
library books were kept in this room for the students’ use. All under- 
stood that if they had something more urgent they might leave at any 
time. Often students remained during both of these hours in order to 
complete their work. 

During this hour they usually worked on the problems and questions. 
Sometimes they worked alone and sometimes in groups. Group activity 
and discussion were encouraged. During these periods the instructor was 
in the room most of the time, but came and went quite frequently. Part
of the time he spent at his desk at his own work. Students were free at 
any time to come to the desk for help on any problem. Help was given by 
either answering directly if no ready reference was at hand or, prefer- 
ably, by suggesting reading material that was pertinent to the problem. 

Sometimes there were informal discussions, and suggestions were 
given to the entire group that was in the room. Sometimes questions 
were brought up concerning the problems discussed during the previous 
class hour. The instructor moved about in the room and asked questions 
attempting to stimulate and help those individuals least likely to ask for 
suggestions. 


Extra Reading and Credit. In order to induce this experimental 
group to do some outside reading besides that which was of a more technical
nature, the students were asked to read some books or articles in
addition to those mentioned above. They were frequently encouraged to
select this material themselves on the basis of interest. At the beginning 
of the term they hesitated and seemed at a loss to know how to do this, 
and as a result it was necessary to offer some suggestions. Later on, 
however, guidance was about all that was necessary. This material dif- 
fered somewhat from the outside reading material in that it was written 
for popular reading. Tho some of this work was required, there was 
much freedom in the selection, and interest was evident. 

Written reports or summaries on this work were handed in. The 
work was subjectively evaluated in terms of points by the instructor on 
the basis of volume of material, difficulty, and apparent grasp as indi- 
cated by the report. One student read Herrick’s Introduction to Neu- 
rology and received credit for 70 points. Several read Keller’s The 
World I Live In, and received from 8 to 12 points. For Myerson’s The 
Nervous Housewife from 16 to 20 points of credit were given. 

In order to provide for individual differences in ability and en- 
courage those who were interested in the subject to do more work, a 
sliding scale of credit was offered varying from three to five hours per 
term. If any student preferred to do only a limited amount of work or if 
the work was so difficult that the student thought he could not do satis- 
factorily as much as the average student, he was given permission to 
work for only three hours of credit. Such students were not required to 
attend the study hour period nor to hand in reports of additional read- 
ing. They were required to attend all regular class hours and to pass all 
of the regular examinations. 

In order to secure four hours of credit per term students were ex- 
pected to attend the study hour period and to hand in written reports 
of outside reading sufficient for a credit of 20 points per term or 40 
points for the two terms. It was intended that the average student 
should do this amount of work. 

Five hours of credit were offered to capable students providing they 
met the above requirements and in addition secured a total credit of 90 
points for outside reading. These students were also required to take 
a short oral quiz on their reading. Their written reports were carefully 
checked and it is believed that a greater amount of work was done for 
this hour of credit than for one hour of the regular work. 


Results 


Extra Reading Not Required. Many times students read only parts 
of books. This was especially true when they were reading in textbooks. 
If the book was a unit they were urged to complete it, but if not they 
were told to stop reading when they lost interest. Table I shows the 
titles of books or subjects together with the author of each work. The 
second column of figures shows the number of pupils who read material 
in each of these books and the third column gives the total number of 
pages read in each book. 

The range of the number of pages read for those who remained in 
the section during the entire eighteen-week period was from 264 to 1,424. 
The three students reporting on the largest number of pages received
extra credit as described above. The one who read the largest amount
received ten hours of credit for the two quarters. The other two received
nine hours of credit. One student worked for only three hours of credit
and therefore reported no reading. The average number of pages read
was 528.


TABLE I. AMOUNT OF OUTSIDE READINGS REPORTED 


                                                                            Number of    Number of
Author                         Title                                        students     pages read
                               Psychology of Personality                        2            454
                               Practice of Auto-Suggestion                      5            600
Carr                           Neural Basis of Intelligence                     3             50
                               Psychology of Religion                           1            200
Commonwealth Fund Committee    Three Problem Children                           5            630
                                                                                4            165
Freeman                        Intelligence Testing                             1             47
                               Psychology of Insanity                           5            860
                               Mental Conflict and Misconduct                   1            100
Herrick                        An Introduction to Neurology                     1             25
Herrick                        Brains of Rats and Men                          10          1,040
                               Reluctantly Told                                 3            615
Hollingworth                   Judging Human Character                          2            450
Keller                         The World I Live In                             19          3,700
Knowlson                       Business Psychology                              2            430
                               Mentality of Apes                                2            660
Ladd and Woodworth             Physiological Psychology                         7            210
Myerson                        The Nervous Housewife                            4          1,068
Norsworthy and Whitley         Psychology of Childhood                          1             50
Overstreet                     Influencing Human Behavior                       4          1,168
Poffenberger                   Psychology of Advertising                        2          1,400
                               The Infant Mind                                  1            120
Robinson and Robinson          Readings in General Psychology                   5            660
Rosanoff                       Manual of Psychiatry                             1            145
Salisbury and Jackson          Outwitting our Nerves                            3            604
Sandiford                      Educational Psychology                           7            324
                               Psychology of Adolescence                        2            190
Watson                         Psychology from Standpoint of a Behaviorist      5            230
Woodworth                      Dynamic Psychology                               1            108


Objective Tests. Six different objective tests were given to the entire
class. The first four tests were devised by the instructor and consisted
mainly of true and false statements. The method of scoring varied
somewhat, but in every case the same system was used for a given test in 
all of the different sections. In some cases the scores were converted to 
a percentage basis while in others the raw score was reported. The two 
other tests were Part I and Part II of May’s test on Woodworth’s text. 
In these tests the true and false statements were scored by subtracting 
the wrongs from the rights. Other questions were scored on the basis of 
the number right. The time was carefully checked and each group was 
given equal opportunity in taking the tests. 
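
The two scoring rules just described can be sketched as follows. The treatment of omitted items (counted as neither right nor wrong) is an assumption, since the text does not say how omissions were handled; the answer key and responses are invented.

```python
def rights_minus_wrongs(responses, key):
    """Score true-false items as rights minus wrongs.

    Assumption: an omitted item (None) counts as neither right nor wrong.
    """
    rights = sum(1 for given, correct in zip(responses, key)
                 if given is not None and given == correct)
    wrongs = sum(1 for given, correct in zip(responses, key)
                 if given is not None and given != correct)
    return rights - wrongs


def number_right(responses, key):
    """Score other question types on the number of correct answers."""
    return sum(1 for given, correct in zip(responses, key) if given == correct)


# Invented five-item true-false key and one student's responses.
key = [True, False, True, True, False]
responses = [True, True, None, True, False]   # three right, one wrong, one omitted
print(rights_minus_wrongs(responses, key))
print(number_right(responses, key))
```

Subtracting the wrongs from the rights means that blind guessing on two-choice items gains nothing on the average, which is the usual reason for preferring it on true-false tests.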

Comparisons of scores were made on two bases. First, the entire 
experimental section which met at 7:30 o’clock was compared with each 
of the other two sections. Since the enrollment was largely a matter of 
chance this was considered a fair procedure. Second, 26 students who re- 
mained in the experimental section thru the entire period were paired on 
the basis of intelligence with individuals selected from the other two sec- 
tions. These intelligence ratings were based on the average scores made 
on two forms of the Otis Self-Administering Tests of Mental Ability, 
Higher Examination. These tests were given by the college the preceding 
year as a part of the entrance examinations. Exact duplicates of scores 
were paired as far as possible and in no case was the variation more 
than two points. 
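
A sketch of the pairing procedure under stated assumptions: exact duplicates of scores are paired first, and each remaining experimental student is then paired with the nearest unused control student within two points. The two-pass order, the function name, and the scores are illustrative; the text gives only the two-point limit and the preference for exact duplicates.

```python
def pair_by_score(experimental, control, tolerance=2):
    """Pair students on intelligence score.

    experimental, control: dicts mapping a student label to an average score.
    Returns (pairs, unmatched experimental labels).
    """
    available = dict(control)
    pairs, remaining, unmatched = [], [], []

    # Pass 1: exact duplicates of scores.
    for name, score in experimental.items():
        match = next((c for c, s in available.items() if s == score), None)
        if match is None:
            remaining.append((name, score))
        else:
            pairs.append((name, match))
            del available[match]

    # Pass 2: nearest unused control student within the tolerance.
    for name, score in remaining:
        close = [(abs(s - score), c) for c, s in available.items()
                 if abs(s - score) <= tolerance]
        if close:
            _, match = min(close)
            pairs.append((name, match))
            del available[match]
        else:
            unmatched.append(name)
    return pairs, unmatched


# Invented scores, for illustration only.
experimental = {"E1": 52, "E2": 47, "E3": 60}
control = {"C1": 47, "C2": 53, "C3": 48, "C4": 70}
pairs, unmatched = pair_by_score(experimental, control)
print(pairs)
print(unmatched)
```

Students left unmatched are simply dropped from the paired comparison, which is presumably why only 26 of the 39 experimental students appear in the paired groups.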


Table II shows the median scores of all groups on the six different 
tests. 


TABLE II. MEDIAN SCORES OF ALL GROUPS 


                                     Median score on test number
Group                                 1     2     3     4     5     6

ENTIRE CLASS
  7:30 Section (Experimental)        79    76    83    61    53    41
  One o’clock Section                71    73    80    59    43    37
PAIRED GROUPS
                                     79    76    84    61    51    41


The scores for the experimental section are higher in all cases ex- 
cept one. In Test 6 the score for the experimental section is 41 and the 
score for the control section is 42. The superiority of the experimental 
section is not great, but it is quite consistent. 

Comparison of groups in educational psychology. Sixteen of the stu- 
dents who were in the experimental section described above enrolled dur- 
ing the spring quarter in a course in educational psychology. These stu- 
dents were paired with others in the same class who had been in the con- 
trol sections during the fall and winter quarters. The pairing was on 
the same basis as indicated above. These 32 students were scattered in- 
discriminately in the three sections of Educational Psychology. Some of 
both groups were in all three sections. 




Five objective tests were given during the quarter and the total 
scores on all five tests were compared. Table III shows the distribution 
of these scores. The experimental group consisted of those who were in 
the experimental section in the first course. The control group consisted 
of those who were in the control sections in the first course. 


TABLE III. DISTRIBUTION OF SCORES MADE IN EDUCA- 
TIONAL PSYCHOLOGY 


Class interval Experimental group Control group 
0 1 


Student Evaluation. Students were asked to state as definitely as 
possible their opinion regarding the method and compare it with the 
customary procedure. They were told that this was wanted for scientific 
purposes and not for sentimental reasons. With the exception of one 
student, all favored this method. Several spoke of the freedom and in- 
formality of the work. Less routine and more independence were men- 
tioned as desirable characteristics of this procedure. Student opinion 
may not be reliable, but it probably reflects at least something as to 
the degree of interest. 


Conclusion 


It would be folly to assume that final conclusions could be drawn 
from such an experiment. It may be that the results indicated above 
add some weight to the accumulation of evidence now at hand. In the 
first place, the study helped to develop something of the experimental 
attitude in the instructor. Some such procedure would be of much value 
to many teachers. The extra reading was of more value than is indicated 
in the objective tests. These questions were not planned in order to test 
this reading. They were based on the classroom work. A few students 
did the minimum amount of reading, but many read much more than was 
necessary. This was done because the students were interested. The 
type of material read had much to do in developing and maintaining this 
interest. Several students expressed a desire to take work in abnormal 
psychology after reading some of these books. 

In the study of the distribution curves for the scores on the objective 
tests it is observed that with one exception the median for the experi- 
mental group exceeds the median for the control groups. The difference 
in some cases is slight. The average difference between the medians will 
likely be between two and three times the probable error of the difference. 
This difference is not generally considered very significant; however, the
consistent superiority of the experimental section impresses one. Similar
results from the comparison of the scores in Educational Psychology in- 
dicate the same thing. The almost unanimous opinion of the students 
favors the use of the method. The general impressions of the instructor 
as to the interest, spontaneity, and appreciation of the students together 
with the other facts stated above lead him to make the following brief
statements: 


1. The indications are that students learned more psychology in 
the experimental section than in the control sections. 

2. Tho the difference between the groups is relatively slight, it is 
consistent and therefore significant. 

3. Any small improvement, due to slight changes in method of 
teaching, is important. 

4. A modified form of this procedure will be used again the fol- 
lowing year. 




The State High School Testing Program:
Some First Results


H. H. REMMERS, Associate Professor of Education and Psychology,
Purdue University 


THE data presented in this paper constitute a sampling of short-term¹
high schools in Indiana. They chiefly reflect, therefore, the educational
situation as it exists in the rural consolidated high schools of
the state. All tests from which these data were obtained were ad- 
ministered on the same day, April 3, approximately two weeks before 
most of these schools closed. 

The tests used in the testing program are based upon the state 
course of study as outlined in the various subject-matter bulletins of 
the State Department of Public Instruction, and, assuming that they 
adequately sample pupil achievement of this content, are valid measures 
of scholastic achievement—more valid, indeed, than would be true of tests 
sold in the open market and standardized on a national basis rather than 
on a state basis. 

Tables I and II give the data on gains in pupil achievement from 
grade to grade for the two tests in English. 


TABLE I. PUPIL ACHIEVEMENT IN ENGLISH: GRAMMAR AND 
MECHANICS OF WRITING 


                                                        Mean gain
Grade    Number of    Arithmetic    S.D.      Raw score    S.D.
         pupils       mean score
9        1,165        40.80         20.25
10       1,010        48.75         22.05     7.95         .36
11       697          54.70         22.80     5.95         .26
12       574          59.80         22.90     5.10         .22


TABLE II. PUPIL ACHIEVEMENT IN ENGLISH: UNDERSTAND- 
ING AND APPRECIATION OF PROSE AND POETRY 


                                                        Mean gain
Grade    Number of    Arithmetic    S.D.      Raw score    S.D.
         pupils       mean score
         1,030        51.90         14.65     5.65         .32
         951          54.80         15.35     2.90         .19
         725          58.45         15.50     3.65         .24
         4,080        52.60         15.60


¹ High schools which are open for only eight or eight and one-half months during
the year. 




Average gains from one year to another, when reduced to comparable 
units (standard deviation units), are of substantially the same order of 
magnitude for both tests—from one-fifth to one-third of a standard 
deviation—the gains tending to decrease toward the upper grades of the 
high school. How much of these average gains is the result of formal 
learning and how much results from progressive elimination of the 
scholastically less able cannot be determined from these facts and figures. 
It is worthy of note, however, that the number of pupils for both tests 
is approximately 100 per cent greater in the ninth than in the twelfth 
grade. Obviously the inference is not far fetched that a considerable 
part of the gain is due to progressive elimination of the less able pupils. 
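
The reduction of gains to standard deviation units can be retraced from the figures for the grammar and mechanics test. Two assumptions are made here: that the four rows of that table are grades 9 to 12 in order, and that each gain is divided by the standard deviation of the upper grade of the pair, a choice which reproduces the values given in the table.

```python
# (grade, number of pupils, mean score, standard deviation) for the
# grammar and mechanics test. The grade labels are an assumption based
# on the order of the rows.
grammar = [
    (9, 1165, 40.80, 20.25),
    (10, 1010, 48.75, 22.05),
    (11, 697, 54.70, 22.80),
    (12, 574, 59.80, 22.90),
]


def gains_in_sd_units(rows):
    """Gain between successive grades, raw and divided by the upper grade's S.D."""
    out = []
    for (g0, _, m0, _), (g1, _, m1, sd1) in zip(rows, rows[1:]):
        gain = m1 - m0
        out.append((g0, g1, round(gain, 2), round(gain / sd1, 2)))
    return out


for lower, upper, raw, in_sd in gains_in_sd_units(grammar):
    print(f"grade {lower} to {upper}: raw gain {raw:.2f}, {in_sd:.2f} S.D.")

# Shrinkage in enrollment between the lowest and highest grade.
ninth, twelfth = grammar[0][1], grammar[-1][1]
print(f"lowest grade is {100 * (ninth - twelfth) / twelfth:.0f} per cent larger")
```

The three ratios come out at about .36, .26, and .22, and the lowest grade is roughly twice the size of the highest, which is the basis for the inference about elimination.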

Perhaps of even greater educational and social significance is the 
enormous amount of overlapping of the various grades. Tables III and 
IV give the facts² in summary. They indicate the startling fact that
from two-thirds to four-fifths of the pupils of a given grade could be 
exchanged for an equal proportion of pupils in the grade next above or 
below, without changing the level or spread of measured achievement; 
that roughly one-half the pupils in grades two years apart could be inter- 
changed in the same way; and that forty per cent or more of high 
school freshmen could be interchanged with seniors. Truly the problem 
of providing for individual differences remains unsolved in the high 
schools as long as such conditions obtain. 
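
The overlap percentages rest on areas under the normal curve, and the calculation can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's worksheet: it assumes that the threshold for "ranking at or above" a higher grade is that grade's mean score, that the lower grade's scores are normal with the mean and S.D. of Table I, and that the Table I rows are grades 9 through 12 in order. Under those assumptions the adjacent-grade figures for grammar and mechanics come out at 35, 39, and 41 per cent, which agree with the corresponding entries of Table III. The function name `pct_at_or_above` is hypothetical.

```python
from statistics import NormalDist

# Grammar and mechanics: (mean, S.D.) by grade, from Table I.
# Assumption: grade labels 9-12 follow the row order of Table I.
scores = {
    9: (40.80, 20.25),
    10: (48.75, 22.05),
    11: (54.70, 22.80),
    12: (59.80, 22.90),
}


def pct_at_or_above(lower_grade, upper_grade):
    """Per cent of lower-grade pupils at or above the upper-grade mean.

    Assumption: lower-grade scores are normally distributed and the
    upper grade's mean is the threshold.
    """
    mean, sd = scores[lower_grade]
    threshold = scores[upper_grade][0]
    return 100 * (1 - NormalDist(mean, sd).cdf(threshold))


adjacent = {
    pair: round(pct_at_or_above(*pair))
    for pair in [(9, 10), (10, 11), (11, 12)]
}
print(adjacent)
```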


TABLE III. PER CENT OF PUPILS WHO RANK AT OR ABOVE
THEIR PRESENT GRADE IN ENGLISH

                                        Understanding and appreciation
      Grammar and mechanics                  of prose and poetry

Per cent   Present    Grade at or       Per cent   Present    Grade at or
of         grade of   above which       of         grade of   above which
pupils     pupils     pupils rank       pupils     pupils     pupils rank

   35         9           10               35         9           10
   28         9           11               28         9           11
   20         9           12               20         9           12
   39        10           11               42        10           11
   31        10           12               32        10           12
   41        11           12               41        11           12


² These percentages were derived on the assumption of normal distributions and 
were from a table of areas under the normal curve. 




TABLE IV. PER CENT OF PUPILS WHO RANK AT OR BELOW
THEIR PRESENT GRADE IN ENGLISH

                                        Understanding and appreciation
      Grammar and mechanics                  of prose and poetry

Per cent   Present    Grade at or       Per cent   Present    Grade at or
of         grade of   below which       of         grade of   below which
pupils     pupils     pupils rank       pupils     pupils     pupils rank

   36        10            9               35        10            9
   40        11           10               43        11           10
   23        11            9               29        11            9
   42        12           11               41        12           11
   32        12           10               34        12           10
   20        12            9               22        12            9


Table V gives data which utterly refute the notion that one high 
school is much like another, even when only rural consolidated high 
schools are considered, as is here the case. 


TABLE V. A COMPARISON OF VARIABILITY OF PUPIL RAW SCORES AND
AVERAGE SCORES OF SCHOOLS

[Table body not recoverable from the source scan.]


The most striking feature of this table is the extreme variability of 
school averages. In the average subject this variability is half as great 
as the variability of individual pupils. In some of the most widely taught 
subjects, indeed, the school average variability is nearly 70 per cent of 
the variability of all pupils in the sampling. This is true, for example, 
of first-year Latin, first-year algebra, plane geometry, physics, and first- 
year typewriting. 

A comparison of school averages in terms of the range of averages 
is also highly illuminating. It is evident from the table that the average 
of the lowest school is in most cases not far above the lowest pupil score 
in the state distribution, and that the highest school average is not far 
below the highest pupil score. 

There is no basis in the data here presented for isolating the various 
possible causes of the wide differences among schools. Certainly it is 
totally unwarranted to charge them solely to differences in effectiveness 
of teaching. Varying pupil ability, apart from teaching effectiveness, 
will account for a large fraction. Other conditions of the teaching situa- 
tion and beyond the direct and immediate control of the teacher and the 
school, such as pupil group attitudes, parental attitudes, community at- 
titudes, etc., doubtless play a part. Whatever the causes may be, there 
is here a most direct challenge to the schools and all concerned with them 
to attempt to change the present situation in a positive direction. It is 
my hope and belief that the state testing program for Indiana³ will be a 
major aid to efforts in this direction. 


³ Those interested in the details of the program for 1930-1931 may obtain from the 
writer a descriptive bulletin while the supply lasts. 


The Growing Demand for Research Workers 
in Bureaus of Educational Research¹ 


HENRY LESTER SMITH, Dean of the School of Education, 
Indiana University 


The Development of Educational Research as a Profession 


WHEN the child was made the center of educational philosophy in 
the two decades immediately preceding the twentieth century, educators 
began to study methods of making practical the theories of the phi- 
losophers. As early as 1904 Thorndike related statistical procedure to 
educational problems. Strayer and Elliott, in 1905, investigated financial 
phases of school administration. Stone published the first achievement 
test in 1908. Boise, Idaho, published the first school survey in 1910. 

From these germs, modest and sporadic in their beginnings, there 
has developed one of the most significant and powerful movements in 
modern education; namely, the organization of educational research 
bureaus and the growth of a new profession. Beginning in 1912, when 
Baltimore, Maryland, established the first city school research bureau, 
the growth of such bureaus has been phenomenal. The Educational 
Directory of 1931, published by the Office of Education, lists a total of 
156 city school research bureaus; 26 state department of education 
bureaus; 9 state educational association bureaus; 33 university and col- 
lege educational research bureaus; 15 teachers’ college and normal school 
bureaus; and 8 research bureaus in child development; a total of 247 
educational research bureaus in the United States. 

This indicated rapid increase in the number of research bureaus in 
the educational world points to the development of a new profession com- 
parable in character to that of the research expert in any large industrial 
corporation. In fact, one state, California, has gone so far as to issue 
a certificate in educational research and guidance, and to set up specific 
scholastic and experience requirements for the certificate. 

With the development of any new profession there arise, in the minds 
of those interested in preparing for employment therein, certain ques- 
tions of vital importance in the decision to enter or not to enter the 
profession. What opportunities does the field offer as to number of posi- 
tions available, salaries to be expected for various levels of the work, 
working conditions, type of work to be done, and security of tenure? 
What qualifications seem to be necessary for success in this field includ- 
ing personal characteristics, length and type of scholastic training, and 
previous experience? 

¹ The author is indebted to Mr. F. R. Noffsinger, assistant in the Bureau of
Coöperative Research, for much of the data in this study.


Source of Data

This study is an attempt to determine these opportunities and qualifications
for employment in educational research bureaus. An inquiry
was sent to a list of research bureaus compiled from Chapman’s study 
and from research reports contained in the Educational Research Service 
of the Department of Superintendence and Research Division of the Na- 
tional Education Association Circular No. 9, 1930. A total of 74 bureaus 
replied. They were distributed as follows: 56 cities, 6 teachers’ colleges, 
5 universities, 3 state departments of education, 1 national bureau, and 
3 bureaus closely related to the educational field. To each of these 74 
bureaus, willing to coöperate further in the study, was sent a second 
inquiry concerning the training, age, salary, experience, type of work 
done, and personal characteristics of all full-time and part-time em- 
ployees. To this second inquiry 40 bureaus replied. Of the 40, 30 were 
city bureaus and 10 were classed as higher research bureaus including 3 
state teachers colleges, 4 state universities, 2 state departments of edu- 
cation, and 1 organization closely related to education. An average of 
7 employees for each bureau was represented in the reports of 273 staff 
members received. A summary of the replies of the first 54 bureaus 
responding to the first questionnaire and the first 22 bureaus responding 
to the second questionnaire needed to be changed only in a few minor 
details when the complete summary was compiled from 74 first question- 
naire replies and 40 second questionnaire replies. This fact seems to in- 
dicate that, altho the sampling used as a basis for this study is only 30 
per cent of the total of 247 research bureaus for the first inquiry and 
16 per cent of the total for the second inquiry, the results are sufficiently 
representative to warrant valid conclusions. 
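
The sampling fractions quoted above follow directly from the counts given in this section. The short check below is a sketch for the reader's convenience; the variable names are hypothetical, and the figures are those stated in the text.

```python
# Counts stated in the text.
TOTAL_BUREAUS = 247      # bureaus listed in the 1931 Educational Directory
first_replies = 74       # bureaus answering the first inquiry
second_replies = 40      # bureaus answering the second inquiry
staff_reports = 273      # staff members described in the second inquiry

# Breakdown of the 74 first-inquiry replies by type of bureau.
first_breakdown = {
    "cities": 56,
    "teachers colleges": 6,
    "universities": 5,
    "state departments of education": 3,
    "national bureau": 1,
    "closely related bureaus": 3,
}

first_coverage = round(100 * first_replies / TOTAL_BUREAUS)    # per cent
second_coverage = round(100 * second_replies / TOTAL_BUREAUS)  # per cent
staff_per_bureau = round(staff_reports / second_replies)

print(sum(first_breakdown.values()), first_coverage, second_coverage,
      staff_per_bureau)
```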


Opportunities in the Field 


The 273 replies to the second inquiry represented three groups of 
research bureau employees: directors, research assistants and clerical 
workers. Assistants are defined as being staff members whose work is 
distinctly research in character and who, on the whole, require some 
specific training to enable them to hold their position. Clerical workers 
are those whose duties are largely stenographic or secretarial in nature. 
The 273 positions represented were distributed as follows: 38 directors, 
126 research assistants, and 109 clerical workers. In other words, the 
typical bureau consists of 7 members: a man director; 3 research assistants,
1 man and 2 women; and 3 clerical workers, all women. Using 
this typical bureau as a basis for computation, there are in the 247 re- 
search bureaus listed by the Office of Education something like 247 di- 
rectors, 247 men research assistants, 494 women research assistants, and 
741 women clerical workers—a total of over 1,700 positions. That this 
number of available positions is rapidly increasing is indicated in two 
ways: first, the number of bureaus is increasing in spite of the fact that 
there is a present tendency to link the research function with the work 
of the assistant superintendent in city school systems; second, the num- 
ber on the staff of the research bureaus is increasing. In 1922 the five 
city bureaus of Oakland, Denver, Cleveland, St. Louis, and Los Angeles 
reported a total of 43 employees. These same bureaus in 1930 reported 
a total of 71 employees, an increase of 65 per cent in the size of the staff. 
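
The estimate of positions and the rate of staff growth can be reproduced as below. This is a sketch only: it takes the typical bureau to have three clerical workers, the figure consistent with a seven-member bureau and with the 741 clerical positions quoted, and it applies that typical bureau uniformly to all 247 listed bureaus, as the text does. The dictionary and variable names are hypothetical.

```python
# Typical bureau of 7 members, as described in the text.
# Assumption: 3 clerical workers (consistent with 741 = 3 x 247).
typical_bureau = {
    "men directors": 1,
    "men research assistants": 1,
    "women research assistants": 2,
    "women clerical workers": 3,
}
BUREAUS_LISTED = 247

estimated_positions = {
    role: count * BUREAUS_LISTED for role, count in typical_bureau.items()
}
total_positions = sum(estimated_positions.values())

# Staff growth in the five city bureaus between 1922 and 1930.
staff_1922, staff_1930 = 43, 71
growth_pct = round(100 * (staff_1930 - staff_1922) / staff_1922)

print(estimated_positions)
print(total_positions, growth_pct)
```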

Salaries for work in research bureaus are, on the average, fairly 
satisfactory. In computing the salaries from the questionnaire replies 
some difficulty was encountered. In some cases the directors and research
assistants were employed to do one or more kinds of work in addition
to the work done for the bureau. The salary given, then, was for 
the entire work done for the school system or the institution. Where the 
proportion of the salary for bureau work could not be ascertained, the 
salary was used as given. In some cases the salary was given on the 
hourly basis. In such cases the annual salary was determined by multi- 
plying the hourly wage by the number of hours the bureau remained open 
per week multiplied by fifty weeks. Thus all salaries were reduced to the 
annual basis. Table I represents the distribution of salaries for the 273 
research bureau staff members. From this table it can be seen that the 
median salary for men directors of city research bureaus is $4,400 with a 
range of from $2,500 to $7,050; for women directors in city bureaus the 
median salary is $3,000; and for directors of higher research bureaus the 
median salary is $4,250. Men research assistants receive a median salary 
of $2,000 and women research assistants obtain a slightly lower median 
salary, $1,925. Clerical workers in research bureaus receive a median 
salary of $1,200. 
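
The rule used above for reducing hourly wages to an annual basis can be written out as a small function. The sketch follows the rule as stated (hourly wage, times office hours per week, times fifty weeks); the function name and the sample wage are hypothetical and are not drawn from the questionnaire replies.

```python
WEEKS_PER_YEAR = 50  # the convention stated in the text


def annual_salary(hourly_wage, office_hours_per_week, weeks=WEEKS_PER_YEAR):
    """Convert an hourly wage to an annual salary by the study's rule."""
    return hourly_wage * office_hours_per_week * weeks


# Hypothetical example: 50 cents an hour in a bureau open 40 hours a week.
example_salary = annual_salary(0.50, 40)
print(example_salary)
```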

There is a distinct relationship between amount of scholastic train- 
ing and salary. In city bureaus directors holding the doctor’s degree 
receive an average salary of $4,735 while those without the degree re- 
ceive an average salary of $3,875. If all 38 directors studied are con- 
sidered, those holding the doctor’s degree receive an average salary of 
$4,445 and those without the degree have an average salary of $3,910. 
In the higher bureaus nine of the 38 research assistants studied hold the 
doctor’s degree and receive an average salary of $4,635 while the average 
for the remaining 29 research assistants is only $1,295. 


TABLE I. DISTRIBUTION OF SALARIES OF STAFF MEMBERS
IN 40 EDUCATIONAL RESEARCH BUREAUS

Column groups: 30 city bureaus, 10 higher bureaus, and all 40 bureaus; within
each group, men and women, subdivided into D, R, and C. Rows: annual salary
intervals of $500, followed by "Not given" and "Total."

[Table body not recoverable from the source scan.]


In this and following tables the symbols D, R, and C refer to di- 
rectors, research assistants, and clerical workers respectively. 

This table is read as follows: One man director of a city research 
bureau received a salary of over $7,000; one man director of a city re- 
search bureau received a salary of between $6,500 and $6,999; one man 
director of a city research bureau and one man director of a higher re- 
search bureau, making a total of two men directors of research bureaus, 
received salaries between $6,000 and $6,499; etc. 

On the whole salaries for all three types of work are considerably 
higher in city bureaus than in higher institution bureaus. 

Working conditions in research bureaus are not as desirable as they 
might be, but there are indications that rapid improvements are being 
made. As is to be expected in any newly developed department of an 
organization the work must, for a time, be conducted in whatever type 
of building space is found available. However, as city school systems 
see the need for special facilities for the conduct of educational re- 
search and build new administrative buildings to house such departments, 
they are able to plan specifically for adequate research bureau space and 
equipment. 

The median number of office hours per week for the 40 bureaus 
studied is 40 with a range of from 33 to 48 hours per week. The number 
of office hours per week for clerical workers is slightly more than for 
other types of research workers, but research assistants and especially 
directors work many hours overtime. The median time given would 
provide for a daily schedule from 8:30 to 12 m. and from 1 to 5 p.m. 
with Saturday afternoon free. 

Directors of research bureaus range in age from 26 to 52, with the 
median age of 42. The median length of time which these directors have 
held their present position is five and a fourth years. Since the median 
date of establishment of the forty bureaus studied is 1925, it is evident 
that there have been very few changes in the directorship of research 
bureaus. The median director secured his position at the age of 37. 

In the typical bureau there are three research assistants, one man 
and two women. The median age of the men is 34 with a range in age 
of from 20 to 75. The median age of the women is 31 with a range in 
age of from 19 to 64. The men have a median tenure of two and one- 
half years; the women, three years. 

The Q1, median, and Q3 ages of the clerical workers are 21, 23, and 
27 respectively representing the ages of the three clerical workers in the 
typical bureau. The first has been employed by the bureau one year; 
the second, three years; and the third, almost five years. 

It is difficult to determine with any reliability the status of tenure of 
research workers because of the short period during which many of the 
bureaus have operated. 

In order to determine the type of work done by research bureau staff 
members, a checking list of types of work was included in the second 
questionnaire. On the basis of the results of the checkings indicated in 
Table II, the types of work may be classified into five groups: those 
done by (a) directors, (b) directors and research assistants, (c) research
assistants, (d) research assistants and clerical workers, and (e)
clerical workers. In the following lists types of work done by each of 
these groups are arranged in the order of frequency of mention: 

(a) Directors: writing research reports, giving information 
and advice, answering questionnaires, supervising research projects, 
giving lectures, preparing record forms, preparing questionnaire 
forms, training others to do research, administering the budget, 
test-making, answering correspondence, holding conferences with 
field men, ordering supplies and filling orders for supplies, testing, 
analyzing printed material, statistical work, preparing study out- 
lines and lectures, and editing publications. 

(b) Directors and research assistants: giving information 
and advice, writing research reports, testing, answering question- 
naires, supervising research projects, doing statistical work, prepar- 
ing record forms, giving lectures, preparing questionnaires, holding 
conferences with field men, answering correspondence, and analyzing 
printed material. 

(c) Research assistants: testing, scoring test papers, giving 
information and advice, writing research reports, tabulating, doing 
statistical work, answering questionnaires, making graphic illustra- 
tions, preparing record forms, supervising research projects, filing, 
making bibliographies, giving lectures, preparing questionnaires, 
holding conferences with field men, annotating bibliographies, 
answering correspondence and analyzing printed material. 

(d) Research assistants and clerical workers: testing, scoring test papers, giving 
papers, tabulating, filing, doing statistical work, and answering cor- 
respondence. 

(e) Clerical workers: typing, doing stenographic work, filing, 
tabulating, mimeographing, scoring test papers, doing statistical 
work, calculator operating, ordering supplies, and answering cor- 
respondence. 


Two types of work, statistical work and answering correspondence, 
are shared by all three groups of workers. One type of work, that of 
ordering supplies, is shared by directors and clerical workers. Six activi- 
ties listed in the questionnaire received so few checkings either by the 
entire number of staff members or by any of the three groups that 
they could not be considered general activities. They are: preparing ex- 
hibits, planning school buildings and building programs, indexing refer- 
ences, locating and checking material in the library, taking care of the 
library, and operating a Hollerith machine. There was no indication that 
the type of work done by each of the three groups of workers in city 
bureaus was different from that done by the same groups in the higher 
bureaus, nor that the work done by men was different from that done 
by women. 
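
The overlaps among the duty lists can be checked mechanically by treating lists (a), (c), and (e) as sets. The sketch below is illustrative only. Item names are normalized editorially so that they can be compared (for example, "doing statistical work" and "statistical work" are treated as one item, as are "preparing questionnaire forms" and "preparing questionnaires"), and the set names are hypothetical.

```python
# Duty lists (a), (c), and (e) from the text, with item names normalized
# for comparison (an editorial assumption, not the study's own coding).
directors = {
    "writing research reports", "giving information and advice",
    "answering questionnaires", "supervising research projects",
    "giving lectures", "preparing record forms", "preparing questionnaires",
    "training others to do research", "administering the budget",
    "test-making", "answering correspondence",
    "holding conferences with field men", "ordering supplies", "testing",
    "analyzing printed material", "statistical work",
    "preparing study outlines and lectures", "editing publications",
}
research_assistants = {
    "testing", "scoring test papers", "giving information and advice",
    "writing research reports", "tabulating", "statistical work",
    "answering questionnaires", "making graphic illustrations",
    "preparing record forms", "supervising research projects", "filing",
    "making bibliographies", "giving lectures", "preparing questionnaires",
    "holding conferences with field men", "annotating bibliographies",
    "answering correspondence", "analyzing printed material",
}
clerical_workers = {
    "typing", "stenographic work", "filing", "tabulating", "mimeographing",
    "scoring test papers", "statistical work", "calculator operating",
    "ordering supplies", "answering correspondence",
}

shared_by_all = directors & research_assistants & clerical_workers
directors_and_clerical_only = (directors & clerical_workers) - research_assistants
directors_and_assistants = directors & research_assistants
assistants_and_clerical = research_assistants & clerical_workers

print(sorted(shared_by_all))
print(sorted(directors_and_clerical_only))
print(len(directors_and_assistants), sorted(assistants_and_clerical))
```

Under this normalization the two activities common to all three groups and the single activity shared only by directors and clerical workers agree with the statements in the text, and the twelve items common to directors and research assistants correspond to list (b).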


TABLE II. TYPE OF WORK DONE BY EMPLOYEES IN 40 EDUCATIONAL
RESEARCH BUREAUS

[Table body not recoverable from the source scan.]


Requirements 


Three types of requirements for work in educational research bu- 
reaus were investigated in this study: personal qualities, training, and 
experience. Each of these three types of requirements was investigated 
from two viewpoints: first, what do directors actually require of ap- 
plicants for positions as qualifications considered essential to success; 
and second, to what qualifications do directors ascribe the success of those 
already employed in research bureaus? 

Personal Qualifications. A total of 54 of the 74 research bureaus 
responding to the first inquiry stated that some kind of personal quali- 
fication was required of applicants for positions in bureaus. Ability to 
coöperate with others, especially teachers, principals, and supervisors, 
is the outstanding personal characteristic designated by directors as es- 
sential. Table IV shows the frequency of mention of personal char- 
acteristics stated by the 54 directors as being required of applicants. 
This tabulation seems to indicate that there is little agreement among 
directors as to what personal characteristics are really essential. 


TABLE III. QUALIFICATIONS REQUIRED OF APPLICANTS
FOR POSITIONS AS STATED BY DIRECTORS OF
74 RESEARCH BUREAUS

(A question mark marks an entry that is illegible in the source scan.)

                         Personal          Scholastic training    Experience
                         qualifications    qualifications         qualifications
Type of bureau           Yes  No  Blank    Yes  No  Blank         Yes  No  Blank    Total

Cities                    41  12    3       48   5    3            32  15    9        56
Teachers colleges          5   1   ..        ?   ?    ?             ?   ?    ?         6
Universities               3   2   ..        ?   ?    ?             ?   ?    ?         5
States                     ?   ?    ?        ?   ?    ?             ?   ?    ?         ?
National                   1  ..   ..        1  ..   ..             1  ..   ..         1
Miscellaneous              2   1   ..        3  ..   ..             1   2   ..         3

Total                     54  16    4       63   7    4            41  21   12        74


This table is read as follows: Of the 56 city bureaus reporting, 41 
stated that certain personal qualifications were required, 12 stated that 
no such qualifications were required, and 3 failed to answer the question. 




TABLE IV. FREQUENCY OF MENTION OF PERSONAL CHARACTERISTICS
STATED BY 54 DIRECTORS AS BEING REQUIRED OF APPLICANTS
FOR POSITIONS IN RESEARCH BUREAUS

Columns: 41 cities; 5 teachers colleges; universities; 2 states; 1 national;
2 miscellaneous; 54 total.

Personal characteristics listed: ability to coöperate; no qualifications
given; scientific attitude; aptitude for research; scholarship, intelligence;
personality; habits of accuracy; initiative and leadership, imagination;
dependability; helpfulness; temperament, emotional poise; executive and
supervisory ability; ability to meet people; attractive; industrious; good
judgment; adaptable to any situa-; habits of system; personal qualities of
first-class teacher; self-confidence; reasonable speed in ex-.

Legible frequency entries: 14, 12, 4, 8 (row and column alignment lost).

[Remainder of table body not recoverable from the source scan.]


This table reads as follows: 14 of the 41 city school research bureaus
stated that ability to coöperate with others was a required qualification
for applicants for positions, as did 2 state departments of education, 1
national research bureau, and 1 related bureau, making a total of 18 of
the 54 bureaus reporting such a requirement.

In order to determine the personal qualifications to which directors 
ascribe the success of those already employed in research bureaus, the 
following checking list was used for each of the 273 employees: 
Honesty (integrity, sincerity, etc.) 
Perseverance (determination, “stick-to-it-ive-ness,” etc.) 
Neatness (tidiness, systematic arrangement, etc.) 




Open-mindedness (willingness to see both sides of a proposition, 
tentative judgment) 

Broad-mindedness (fairness, impartiality, justice, tolerance, etc.) 

Adaptability (flexibility, teachableness, versatility) 

Courtesy (graciousness, mannerliness, refinement, politeness, etc.) 

Poise (calmness, deliberateness, self-confidence, self-reliance, dignity, 
etc.) 

Initiative (originality, enterprise, resourcefulness) 

Efficiency (expertness, accuracy, system, ability) 

Responsibility (trustworthiness, accountability, dependability, etc.) 

Loyalty (dutifulness, fidelity, etc.) 

Service (helpfulness, unselfishness, sympathy, usefulness, coöperation,
etc.) 

Sportsmanship (fair-mindedness, courtesy to opposition, square deal, 
etc.) 

Tact (the ability to deal with others without giving offense, diplomacy,
etc.) 


Table V shows the results of the checkings. On the basis of this 
tabulation success in research work by those already employed has been 
dependent upon the possession of the following personal characteristics 
in order of their importance: Responsibility, efficiency, perseverance, 
honesty, courtesy, adaptability, neatness, loyalty, and service. When the 
research employees are classified into the three groups—directors, re- 
search assistants, and clerical workers—the following personal charac- 
teristics have been found essential in order of their importance: 

Directors: initiative, open-mindedness, courtesy, tact, responsi- 
bility, perseverance, adaptability, efficiency, and service. 

Research assistants: perseverance, responsibility, open-minded- 
ness, honesty, courtesy, loyalty, adaptability, efficiency, service, poise, 
broad-mindedness, and initiative. 

Clerical workers: neatness, efficiency, responsibility, honesty, 
perseverance, loyalty, adaptability, courtesy, and service. 


Neither the type of bureau nor the sex of the employee made any 
significant changes in the order of the requisite personal qualifications 
for the three levels of positions. 


TABLE V. PERSONAL CHARACTERISTICS STATED BY [...] AS BEING
RESPONSIBLE [...] THE [...]

Column groups: 30 city bureaus, 10 higher bureaus, and total of 40 bureaus.

[Table body not recoverable from the source scan.]


Training Qualifications: Length or Level. A total of 63 of the 
74 research bureaus responding to the first questionnaire stated that 
certain definite scholastic training is required of applicants for positions 
in research bureaus. Forty of the 63 bureaus gave the length or level 
of training required. Two bureaus require the doctor’s degree for re- 
search assistants and directors; 18 require the master’s degree; 12 re- 
quire the bachelor’s degree; and 6 city bureaus require a teacher’s license. 
The remaining bureaus require graduate study in education except for 
strictly clerical workers. 

When the replies of the 273 employees in 40 research bureaus are 
investigated, it is found that all except three are high school graduates. 
All 38 directors hold the bachelor’s degree. Only 4 of the 41 men re- 
search assistants have not received the bachelor’s degree, while 33 of the 
85 women research assistants do not hold the degree. Of the 109 clerical 
assistants only 14 have the bachelor’s degree. 

Thirty-four of the 38 directors hold the master’s degree. Of the 
126 research assistants 54 have received the master’s degree. Only one 
of the 109 clerical workers holds the master’s degree and this employee is 
in a university bureau. 

Nineteen of the 38 directors have received the doctor’s degree. Seven 
women directors are among the 19 who do not hold the degree. Only one 
of the 9 directors of higher bureaus does not hold the degree while 19 of 
the 29 directors of city bureaus do not hold the degree. Of the 126 re- 
search assistants only 10 hold the doctor’s degree and 9 of these 10 are 
in higher bureaus. 

Of the total of 29 employees who hold the doctor’s degree 11 have 
received the degree since they began work in their present position and 
7 others received their degree about the time they began to work in their 
present position. 

About 25 per cent of the 109 clerical workers were found to have 
been graduated from commercial courses. 
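
The degree counts reported above can be reduced to per cents for each group of employees. The figures below are a derived illustration, not results stated in the study: the counts are taken from the text, the number of research assistants holding the bachelor's degree is obtained by subtraction, and the absence of clerical workers with the doctor's degree is implied by the total of 29 rather than stated. The names `staff`, `degree_holders`, and `per_cent` are hypothetical.

```python
# Number of employees in each group, from the second inquiry.
staff = {"directors": 38, "research assistants": 126, "clerical workers": 109}

# Degree holders by group. Counts come from the text; the bachelor's count
# for research assistants is 126 less the 4 men and 33 women without it.
# Assumption: no clerical worker holds the doctor's degree (implied by the
# total of 29 doctor's degrees, 19 directors plus 10 research assistants).
degree_holders = {
    "bachelor's": {"directors": 38, "research assistants": 126 - 4 - 33,
                   "clerical workers": 14},
    "master's": {"directors": 34, "research assistants": 54,
                 "clerical workers": 1},
    "doctor's": {"directors": 19, "research assistants": 10,
                 "clerical workers": 0},
}


def per_cent(degree, group):
    """Per cent of a staff group holding the given degree, rounded."""
    return round(100 * degree_holders[degree][group] / staff[group])


doctorates = sum(degree_holders["doctor's"].values())

for degree in degree_holders:
    print(degree, {group: per_cent(degree, group) for group in staff})
print("doctor's degrees in all:", doctorates)
```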


Specific Courses. A total of 43 of the 63 bureaus requiring specific 
scholastic training stated specific courses required of applicants for posi- 
tions. Since only a few stated requirements for clerical workers, that 
group will not be considered here. On the basis of questionnaire replies, 
applicants for research positions in educational bureaus are required to 
present specific course training in the following fields, stated in order of 
frequency of mention: tests and measurements, educational statistics, 
psychology, research methods and techniques, school administration, 
supervision, education, guidance, recent literature on research, social 
sciences, proof reading, and English. 

An attempt was made to determine the specific courses taken by 
the 273 research employees which, in the opinion of the directors, seemed 
to account for success in research bureau work. A checking list of 
certain courses, which seemed to bear some relationship to various types 
of educational research, was placed in the second questionnaire. Table 
VI shows the results of these checkings. Thirty-one of the 38 bureau 
directors think that courses in school administration and educational 
statistics are most important among scholastic courses listed in preparing
them for their positions. The course in tests and measurements was
designated as being the next most important, followed by research methods,
supervision, clinical psychology, and English. Directors stated that 
75 of the 126 research assistants could attribute their success in research 
work, so far as scholastic training is concerned, to their course in tests 
and measurements. The course in educational statistics and the course 
in research methods follow next in order. Directors also stated that 63 
of the 109 clerical workers may attribute their success to their com- 
mercial courses. 

Among the 126 research assistants, the course in school administra- 
tion was checked for the men more times than for the women, but the 
reverse was true for the courses in clinical psychology and English. 

A difference was found between the checkings for research assistants 
in city and in higher bureaus. City research assistants found greater 
use for the courses in clinical psychology, school administration, and 
supervision. Clerical workers in higher bureaus found greater use for 
the course in tests and measurements. 

When the entire 273 staff members are considered, tests and measure- 
ments and educational statistics were found to be the most useful courses 
followed, in order, by English, research methods, commercial courses, and 
school administration. 


TABLE VI. SPECIFIC SCHOLASTIC TRAINING STATED BY [...] AS BEING
RESPONSIBLE [...] THE SUCCESS OF STAFF OF RESEARCH [...]

[Table body not recoverable from the source scan.]


Experience Qualifications. Of the 74 bureaus studied by means of 
the first questionnaire, only 41 require certain experience qualifications. 
Applicants for research positions in 29 bureaus are required to have 
teaching experience. No other experience qualification was found to be 
general. Other types of experience required in order of frequency of 
mention are: administration, research, supervision, testing, guidance, 
clerical, and social service. 

In actual practice it was found that only 36 of the 164 directors 
and research assistants did not have teaching experience, while only 17 
of the 109 clerical workers had taught. 

As for other types of experience, it was found that 22 of the 38 
research directors have had experience as principal, 11 as superintendent, 
9 as both superintendent and principal, 4 as both supervisor and princi- 
pal, and 1 as superintendent, supervisor, and principal. Of the 126 re- 
search assistants, 13 have had experience as principal, 11 as superin- 
tendent, 13 as supervisor, 4 as both superintendent and principal, 5 as 
both supervisor and principal, and 2 as superintendent, principal, and 
supervisor. Thirteen of the 38 research directors have had former re- 
search experience; 31 of the 126 research assistants have previously held 
research positions; and 58 of the 109 clerical workers have formerly had 
clerical experience. Previous experience as a statistician, librarian, 
editor, or counselor was indicated a few times on the returned question- 
naires. 

A checking list of types of experience which in the opinion of di- 
rectors accounted for success in research bureau work shows that 28 of 
the 38 directors claimed that their teaching experience seemed to have 
made them more efficient in research work and 28 also checked ad- 
ministrative experience as essential to success. The men research as- 
sistants had checked for them teaching, administrative, and statistical 
experience in the order given. Teaching and statistical experience were 
checked most often for women research assistants. Of the 109 clerical 
workers, 65 had office experience checked as being essential to successful 
work in a research bureau. 


Summary 


1. There are approximately 250 educational research bureaus in 
the United States, employing something like 1,700 persons, nearly 1,000 
of whom are engaged in work distinctly research in character. 

2. The number of such bureaus is rapidly increasing. 

3. The median bureau employs a director, a man research as- 
sistant, 2 women research assistants, and 8 women clerical assistants. 

4. Directors receive a median salary of $4,400, research assistants 
a median salary of $2,000, and clerical workers a median salary of $1,200. 

5. Salaries are higher in city school research bureaus than in 
higher bureaus, and are higher for men than for women. 

6. Research assistants and directors with the greatest amount of 
scholastic training receive the greatest amount of remuneration. 

7. Working conditions in research bureaus are not satisfactory but 
are gradually improving. 


8. Bureaus are open 40 hours per week, but employees holding the 
higher positions in bureaus work many hours overtime. 

9. Bureaus have not been established for a long enough period to 
determine the tenure of employees. 

10. The median age of directors is 42; of research assistants, 32; 
and of clerical workers, 23. 

11. Directors state that the ability to coöperate with others, espe-
cially teachers, principals, and supervisors, is the outstanding personal
characteristic required of applicants for positions in research bureaus.

12. Directors state that the following personal characteristics of 
those already employed have been responsible for successful work in 
bureaus: responsibility, efficiency, perseverance, honesty, courtesy, adapt- 
ability, neatness, loyalty, and service. 

13. Directors state that graduation from college is required of ap- 
plicants for research positions. They also state that one or more years 
of graduate study are desirable. 

14. Directors state that 127 of the 164 research assistants and di- 
rectors are college graduates and 88 hold the master’s degree. 

15. Directors state that applicants for research positions are re- 
quired to have special training in tests and measurements, educational 
statistics, psychology, research methods, school administration and super- 
vision. 

16. Directors attribute success of those already employed in re- 
search bureaus to specific training in tests and measurements, research 
methods, clinical psychology, and English, for directors; tests and 
measurements, educational statistics, and research methods, for research 
assistants; and commercial training for clerical workers. 

17. Applicants for research positions are required to have teaching 
experience. 

18. Directors attribute success of those already employed in re- 
search bureaus to teaching and administrative experience. 




An Experiment with the Lecture Method
in College Teaching


J. R. SHANNON, Professor of Education, Indiana State Teachers College, 
Terre Haute 


ALTHO higher education came later than elementary and secondary 
education in the movement for scientific analysis and readjustment, it 
is now in the full swing of the movement. Methods of college instruc- 
tion are now being given their share of attention. The time-honored 
lecture method has come in for more than its share of attention. Most 
of the attention, however, has been in the form of tirades. A well-known 
college president has stated that the lecture method is probably the worst 
scheme ever devised for teaching college students. Followers of the so- 
called scientific procedure have been so persistent in their jibes at lectur- 
ing that it has become quite the style to take a fling at the method in 
hopes of sharing in the applause that accompanies the downfall of the 
traditional at the hands of scientific progress. The attack has been so 
persistent and militant that the public has begun to agree in the condem- 
nation. Audiences have been taught to titter when the lecture method 
is derided by one who may think he is original or clever. 

This popular pastime of the ultra-smart, which is probably similar 
in its psychology to the Ford stories of a decade ago or to the Scotch 
jokes of more recent years, prompted the writer to wonder whether the 
critics were basing their remarks on known facts or whether they were 
simply repeating what they had observed was well received when spoken 
by some other iconoclast on some previous occasion. Recalling that in 
his own college days he had sat thru the lectures of a certain professor 
under whom it seemed he had learned more and worked less than under 
any other professor he had ever known, the writer undertook to experi- 
ment. Being an educationist, the writer selected for his experiment 
two of his own classes in Principles of Secondary Education. These 
classes followed the same course of study and were taught by the same 
instructor during the winter quarter, 1931, at Indiana State Teachers 
College. The enrollment in one class was 36 and in the other it was 35. 
In each class the division between the sexes was about equal. In each 
class there was one student who had taken the course before under the 
same instructor but had failed to pass.¹ One class met at eight in the
morning four days each week, and the other met at one in the afternoon 
four days each week. The intelligence of the two groups, as indicated by 
percentile rankings on the American Council psychological examination, 
was not equal, but it was found that this inequality did not affect the 
results of the experiment. The median intelligence percentile of the 
morning group was 60, and the median intelligence percentile of the 
afternoon group was 73. 


1 These students are number 1 in the morning group and number 27 in the afternoon 
group. (See Table I.) 




The course in Principles of Secondary Education consisted of the 
first sixteen chapters of The American Secondary School, by Koos, and of 
a number of supplementary references and lectures. At four places in 
the course the instructor assigned to one group a reference in the library 
and to the other he lectured on the same topic, following in his lectures 
quite closely the same wording as found in the written material. In all 
cases of library assignments there were sufficient copies so that no stu- 
dent was unable to make adequate preparation. All four topics included 
in the experiment were parts of the course, and they were introduced in 
connection with the consideration of the material in Chapters VIII, XI, 
XIV, and XVI of Koos’ book. The articles assigned were (1) “An Un-
exploited Opportunity in the Six-Year High School,” School Review,
December, 1928; (2) “Direct Values,” in Inglis, Principles of Secondary
Education, pages 388-94; (3) “Three Dimensions in Curricular Adapta-
tion,” Education, December, 1930; (4) “Post-School Careers of High-
School Leaders and High School Scholars,” School Review, November,
1929, and a sequential article, “The Correlation of High-School Scholastic
Success and Later Financial Success,” School Review, February, 1931.

Library assignments were chosen instead of text assignments for the 
experiment because the students to whom lectures were given would be 
less likely to find the written material. In no instance were the students 
in a lecture group told what the written reference of a lecture was. 
Three of the four assignments used were articles by the present writer. 
The reasons for leaning heavily upon the works of the same writer were
(1) that there would be less basis for the accusation of bias on the part
of the lecturer when he was also the author read after, (2) that there
would be less basis for the accusation that the lecturer did not know
himself what was in the articles assigned and therefore was unable to
give correct statements in his lectures or to prepare satisfactory tests
on the material assigned, and (3) that a greater similarity in style and
presentation could probably be maintained between the written and oral
accounts of a topic.

One group of students was sent to the library for the first and third 
articles and lectured to on the second and fourth, while the other group 
was sent to the library on the second and fourth and lectured to on the 
first and third. The first and fourth assignments used in the experiment 
were researches and were statistical in nature, while the second and 
third were of a philosophical character. Thus each group was lectured 
to on a statistical topic and on a philosophical topic, and was also sent to 
the library to read on topics of the same types. 
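
The rotation just described may be set down as a small table. The sketch below is illustrative only: the labels `group_a` and `group_b` are placeholders, since the passage does not say which class received which treatment, and the helper names are hypothetical.

```python
# Kind of material in each of the four experimental topics, as stated above.
TOPIC_KIND = {1: "statistical", 2: "philosophical",
              3: "philosophical", 4: "statistical"}

# Placeholder group labels (an assumption): one group read topics 1 and 3
# in the library, the other read topics 2 and 4.
LIBRARY_TOPICS = {"group_a": {1, 3}, "group_b": {2, 4}}


def treatment(group, topic):
    """Return the procedure a group received on a topic."""
    return "library" if topic in LIBRARY_TOPICS[group] else "lecture"


def schedule():
    """Build the full plan as group -> topic -> procedure."""
    return {g: {t: treatment(g, t) for t in TOPIC_KIND} for g in LIBRARY_TOPICS}


def is_balanced(plan):
    """True if each group meets each procedure once on each kind of topic."""
    wanted = {(method, kind)
              for method in ("lecture", "library")
              for kind in ("statistical", "philosophical")}
    return all(
        {(method, TOPIC_KIND[t]) for t, method in by_topic.items()} == wanted
        for by_topic in plan.values()
    )


plan = schedule()
for group, by_topic in plan.items():
    print(group, by_topic)
print("balanced:", is_balanced(plan))
```

The check confirms only what the paragraph states: each group is lectured to once on a statistical and once on a philosophical topic, and is sent to the library for one topic of each kind.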

The assignment to a library group consisted of simply announcing 
the reference to be read and stating that a test would be given on the 
same on a given day (one or two days later) without any class discus- 
sion. An exception to this was made in the case of the fourth assign- 
ment, when the students were told to give special attention to the last 
table in the first article of the assignment. The students probably knew 
from previous experience with the instructor that the examinations would 
be limited to matters of major importance in the assignments. The stu- 
dents were asked to note the length of time they spent in study on each 
reading assignment and to report the same. The average numbers of
minutes reported for study on the four assignments were 81, 63, 56, and
68, and the medians were 60, 60, 55, and 60. These measures compared 
to the number of minutes devoted to the lectures, 50, 35, 39, and 50, in- 
dicate a greater amount of time given to reading of a topic than to the 
lecturing on the same, but this measure is not wholly exact, for it was 
learned that some students included in their reports the time spent in 
reviewing their reading notes also. No attempt was made to measure or 
regulate the time spent by either group in reviewing notes. 
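
The comparison of study time with lecture time can be made explicit. The sketch below uses only the minutes reported in the paragraph above; the variable names are illustrative, and, as the writer cautions, the study figures include some time spent in reviewing notes.

```python
# Minutes per topic, in the order of the four assignments (from the text).
study_mean = [81, 63, 56, 68]     # average minutes of reading reported
study_median = [60, 60, 55, 60]   # median minutes of reading reported
lecture = [50, 35, 39, 50]        # minutes given to the lecture on the topic

# Excess of reading time over lecture time, by average and by median.
excess_mean = [s - l for s, l in zip(study_mean, lecture)]
excess_median = [s - l for s, l in zip(study_median, lecture)]

for topic, (by_mean, by_median) in enumerate(zip(excess_mean, excess_median), start=1):
    print(f"Topic {topic}: reading exceeded lecturing by "
          f"{by_mean} min (average) and {by_median} min (median)")
```

By either measure the reading took longer than the lecture on every one of the four topics.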

The students in a lecture group were told that they would be given 
a test on the lecture the day following. No notebook requirements were 
imposed in the course, altho the students were advised to take rough notes 
during lectures and to write these up in complete form afterwards. No 
effort was made at any time to see whether or not the students were 
following the instructor’s advice concerning note-taking.² Four students
in each class used shorthand in noting the lectures.³ The instructor
strived in his lectures to duplicate the presentation used in the written 
reports. He tried not to give any new material or to use any new ways 
of explaining the points except to use the blackboard to present statis- 
tics. The lecturer tried not to “load the dice” in favor of the lecture 
procedure. Infrequently a student would break into a lecture to ask a 
question.⁴

Objective tests of the recall type were prepared by the instructor 
and placed in mimeographed form in the hands of the students. These 
were intended to cover only the important items in the topics under con- 
sideration. They provided for 32, 20, 35, and 48 responses, respectively. 
Recall types were used because they were thought to be more difficult and 
to measure higher degrees of learning than recognition tests. The tests 
measured principally comprehension and retention of the material con- 
tained in the readings and lectures. They were not tests of mere rote 
memory, however. Since the tests were not standardized, the time limits 
set were somewhat arbitrary. The instructor told each group of students 
on each test that there would be a time limit, but he did not state what 
it would be. He determined the time limit by having the first class tested
each time stop when it became apparent that ten of the group had
finished. Then he gave the second class the same number of minutes
allowed the first. The numbers of minutes used in giving the tests were
15, 14, 20, and 28, respectively. Both tests on a topic were given the
same day, thus making all “leaks” in favor of the afternoon group. (The 
writer does not think there were any “leaks,” however.) If a student 
was absent from class on the day of a lecture, or if a student was absent 
when a library assignment was made and came to class on the day of the
test unprepared, he was regarded as being absent on the day of the
test.

2 Neither the lecture procedure nor the library-assignment procedure was new to
the students. Also, many more tests than the four used in the experiment were given
the students on assignments thruout the term.

3 These were students number 2, 7, 9, and 31 in the morning group, and students
number 1, 3, 6, and 29 in the afternoon group. (See Table I.)

4 In order to give testimony on how nearly the lectures paralleled the written ac-
counts, an expert stenographer was brought in to note the third lecture. Space does
not admit the inclusion of the stenographic report of the lecture in this account, but
the report is being held by the writer. The report shows the lecture to be very similar
to the written article. Also, copies of the tests used in the experiment are being held.

It was intended originally that neither group be allowed to know 
that an experiment was in progress. However, an incident occurred to 
change the plan. It was discovered that, after the lecture on the first
topic was delivered, two students in the lecture group had found
the article in the library and read it in addition to hearing the lecture.⁵
Thereupon, both groups were told what was being done and asked to 
play fair, which they did. After the second test the students were told 
that one teaching procedure was proving to be superior to the other, but 
no hint was given as to which it was. 

Complete statistical data of the investigation are shown in Table I. 
For each group of students are shown the sex of each student by the 
letters F or M, the percentile rating on intelligence, the score made on
each test of the experiment, and the score made on a standardized read- 
ing test. At the bottom of the table are shown a number of averages. 
By means of these averages, comparison between the two teaching pro- 
cedures can be made. 

When the average scores on the four tests for all the members of 
the two groups are compared, it is observed that in every instance the 
lecture group excelled the library group. The degree of superiority was 
from 11 per cent to 21 per cent. Altho the morning group was inferior 
to the afternoon group in intelligence, its members did better on the 
second and fourth tests. However, the afternoon group, when lectured 
to, showed still greater superiority over the less intelligent group. 
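
The paper does not state how the "degree of superiority" was computed. One plausible reading, offered here only as an assumption, is the excess of the lecture group's average score over the library group's average, expressed as a per cent of the latter. The scores in the sketch are invented for illustration and are not taken from Table I.

```python
def percent_superiority(lecture_mean, library_mean):
    """Excess of the lecture average over the library average, in per cent.

    This definition is an assumption; the source reports the percentages
    without giving the formula behind them.
    """
    if library_mean <= 0:
        raise ValueError("library_mean must be positive")
    return (lecture_mean - library_mean) * 100 / library_mean


# Hypothetical averages on one test (not the study's data).
example = percent_superiority(lecture_mean=24.0, library_mean=20.0)
print(f"Lecture group superior by {example:.0f} per cent")
```

If the percentages were instead differences between per cents of the possible score, the figures would be read somewhat differently; the source does not settle the point.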

When the tests on the material of the first and fourth assignments 
(the researches) were prepared, some of the questions were made on 
the procedures of the researches and some were made on the conclusions. 
Assuming that the members of the classes, being undergraduates, should 
be more interested in the conclusions of the researches than in the pro- 
cedures followed, the writer made a special comparison of the success of 
the two groups on the portions of the tests relating to the conclusions. 
By this comparison of the two teaching methods, the lecture method 
again excelled each time and to a larger degree than on the tests as 
wholes. 


5 These students were numbers 6 and 18 of the afternoon group. 


[TABLE I. STATISTICAL RESULTS OF TESTS GIVEN TO THE TWO GROUPS OF STUDENTS. The body of this table, which gives for each student the sex, the intelligence percentile, the scores on the four tests, and the reading score, is illegible in the source.]


In order to determine whether there were any sex differences in re- 
gard to the lecture versus the library-assignment procedures of teaching, 
the average scores of the males on one method were compared to those
of the males on the other, and the females on one to the females on the 
other. Again, in every instance, the comparisons favored the lecture 
method. The degree of superiority of the lecture groups among the males 
ranged from 9 per cent to 27 per cent, and among the females it ranged 
from 3 per cent to 30 per cent. This wider range of superiority among 
the girls should not be attributed to sex differences. A study of the 
intelligence ratings will show that the girls in the morning group were 
distinctly lower than the girls in the afternoon group. The median in- 
telligence percentiles for the girls in the two groups were 46 and 85. 
The girls in the morning group excelled the girls in the other group 
only 3 and 6 per cent on the “lecture” tests, while the brighter girls 
excelled the morning girls by 20 and 30 per cent on their “lecture” 
tests. The median intelligence of the boys in one group was about the 
same as that for the boys in the other, the median percentiles being 67 
for the morning group and 65 for the afternoon group. In all tests of 
the experiment, the girls of both groups excelled the boys of their same 
groups with the single exception of the third test in the morning group. 

As a further check on the effect of intelligence as a factor in 
the relative merit of the lecture method versus the library-assignment 
method, the high one-fourth in intelligence of one class was compared to 
the high one-fourth in intelligence of the other class on each test, and 
the low one-fourth of one was compared to the low one-fourth of the 
other. Again every comparison favored the lecture method. However, 
the degree of superiority in the low one-fourths was greater than in the 
high one-fourths, the percentages of superiority in the low one-fourths 
being 26, 15, 19, and 21, as opposed to 6, 10, 17, and 9 in the high one- 
fourths. This latter difference should not come as a surprise to any one 
familiar with prevailing correlations between intelligence and reading 
ability. The correlations (Spearman Footrule Formula) between the 
rankings of the students on intelligence, as shown by the American 
Council psychological examination, and the rankings on reading, as 
shown by the paragraph test in the Nelson-Denny Reading Test, were 
.689 for the morning class and .772 for the afternoon class. 
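
The paper names the Spearman Footrule Formula but does not print it. The form commonly given in statistics texts of the period is R = 1 − 6ΣG/(N² − 1), where ΣG is the sum of the gains in rank from the first ranking to the second and N is the number of cases. The sketch below assumes that form; the function name is hypothetical, and the ranks shown are invented for illustration and are not the students' ranks.

```python
def footrule(ranks_x, ranks_y):
    """Spearman's footrule coefficient R = 1 - 6 * sum(gains) / (n**2 - 1).

    A "gain" is a positive difference in rank from the first ranking to
    the second; losses are ignored, since they total the same amount.
    """
    n = len(ranks_x)
    if n != len(ranks_y) or n < 2:
        raise ValueError("need two rankings of equal length, at least 2 cases")
    gains = sum(y - x for x, y in zip(ranks_x, ranks_y) if y > x)
    return 1 - 6 * gains / (n * n - 1)


# Hypothetical ranks of five students on intelligence and on reading.
intelligence_rank = [1, 2, 3, 4, 5]
reading_rank = [2, 1, 3, 5, 4]
r_example = footrule(intelligence_rank, reading_rank)
print(f"R = {r_example:.3f}")
```

Identical rankings give R = 1. The coefficient is a rough measure and does not fall to −1 for completely reversed rankings, so values obtained by it are not directly interchangeable with product-moment coefficients.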

It has been found by every comparison made that the lecture method 
proved superior to the library-assignment method, but that this superior- 
ity was more noticeable with the less intelligent students than with the 
more intelligent. This finding suggested the need for further data. 
Were the college students used as subjects of study in this investigation 
typical in their ability to read? To answer this question the students 
were given the Nelson-Denny Reading Test, Form A. The scores on the 
paragraph portion of the test alone were used.⁶ The standard median
score on the paragraph portion of the test is 44. The median score in 
the morning group of students used in this investigation was 38, and in
the afternoon group it was 46. The median in the two groups combined
was 42, which score is only slightly below standard. Thus it is prob-
able that the subjects used in the study were about normal in their
ability to read and that the results of the study cannot be discounted be-
cause of atypical subjects in this respect.

6 Few reading tests have well-standardized norms for college juniors. The norms
for college juniors on the test used were based on only 570 cases. A very few seniors
were enrolled in the two classes used in this study, but they were not given separate
consideration.

The data from the reading tests furnished a foundation for some 
final comparisons between the lecture method and the library-assignment 
method as procedures in college instruction. Comparisons of students 
ranking highest on the reading test were made, and comparisons of stu- 
dents ranking lowest on the reading test were made. Again all compari- 
sons favored the lecture method. However, as in intelligence, the degree 
of superiority in favor of the lecture method was greater when the 
poorest readers were considered than when the best were considered. 
The percentages showing this, as found in Table I, are 19, 16, 25, and 
22, as compared to 17, 2, 20, and 13. 
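
The two sets of percentages, those for the intelligence fourths and those for the best and poorest readers, can be summarized by averaging each set over the four tests. The averaging is a convenience introduced here and is not a computation reported by the writer; the labels in the sketch are illustrative.

```python
# Per cent superiority of the lecture method on the four tests, as
# reported in the text for each sub-group.
superiority = {
    "low fourth in intelligence": [26, 15, 19, 21],
    "high fourth in intelligence": [6, 10, 17, 9],
    "poorest readers": [19, 16, 25, 22],
    "best readers": [17, 2, 20, 13],
}

# Simple average over the four tests (an added summary, not the author's).
averages = {label: sum(values) / len(values)
            for label, values in superiority.items()}

for label, value in averages.items():
    print(f"{label}: {value:.2f} per cent on the average")
```

On this summary the advantage of the lecture is larger for the low fourth than for the high fourth, and larger for the poorest readers than for the best, which agrees with the comparisons made test by test.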

A final conclusion to the study could be that by every comparison 
made the students taught by the lecture method excelled those sent to 
the library, but the degree by which they excelled was more noticeable 
among the less intelligent students and poor readers than among the 
more intelligent and good readers. 

One may wonder why such a situation would be true. A number
of theories might be advanced to explain the superiority of the lecture 
method, but let the students who listened to the lectures speak. The 
reasons they gave will probably comprehend all that might be hypo- 
thetically proposed. 

After all tests of the experiment were over, but before the students 
were informed of the outcome of the investigation, the students were 
asked to indicate which method, if either, they liked better, and to give 
all their reasons for the preference they expressed. Of the 71 students 
in the two classes, 67 preferred the lecture method, 2 preferred the li- 
brary-assignment method, and 2 had no preference. Of the two favoring 
the library-assignment method, one was in the top one-fourth and the 
other was in the bottom one-fourth in intelligence, and of the two ex- 
pressing no choice, also one was in the top one-fourth and the other in the 
bottom one-fourth.⁷ The reasons given by the students were carefully
studied by the writer, and, after being translated into common terms, 
they were classified under a number of headings shown in Table II. 
Although a single student often gave the same reason in different 
language more times than once, only one time per reason per student 
is reported in Table II.⁸ The reasons given by the students for preferring
lectures to reading are classified according to the intelligence rating of 
the students giving the reasons. It will be noted that for most reasons 
listed in Table II there is a fairly equitable distribution of frequencies 
among the three intellectual levels considered. Since so few students 
voted for anything except the lecture method and no single reason was 
given more than once, no report is given here of their reasons. 
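
The vote may be checked and expressed as shares of the 71 students. The sketch below uses only the counts given in the paragraph above; the labels are illustrative.

```python
# Preferences as counted in the text.
preferences = {"lecture": 67, "library assignment": 2, "no preference": 2}

total = sum(preferences.values())
shares = {choice: count / total for choice, count in preferences.items()}

print(f"Students voting: {total}")
for choice, share in shares.items():
    print(f"{choice}: {share:.1%}")
```

The counts sum to the 71 students enrolled, and something over nine-tenths of them preferred the lecture.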


7 The two favoring the library-assignment method were numbers 3 and 82 of the
afternoon group. The two expressing no preference were number 1 in the morning 
group and number 30 in the afternoon group. 

8 A student would give as one reason, for example, that in the lecture method the
essentials were stressed, and then as another reason say that in the library-assignment 
method he had to read much irrelevant material. 




TABLE II. REASONS THAT STUDENTS PREFER THE LECTURE METHOD

Reasons given by students          Frequency of mention: High ¼ | Middle | Low ¼ | Total

Important points are emphasized and are more readily seized by students without reading irrelevant material ....... 14 28 13 55
Saves students’ time ....... 10 12 3 25
Students can ask questions on points
Lecturer may give more examples and other material than is included in
Lecturer gives topics more psychological arrangement and presentation than found in written articles ....... 4 5 1 10
Lecture method is more human in that it gives the personal touch ....... 1 4 3 8
Students can remember better ....... 2 5 1 8
Lecturer uses simpler language than is found in written articles ....... 1 5 2 8
So-called “ear-mindedness” ....... 1 4 5
Oral presentation more impressive ....... 1 2 - 3
Students are given an opportunity to react in the course of a lecture and thus stimulate their thinking ....... 1 - 2 3
Tables and graphs are presented in lectures by a combination of oral and visual methods which are superior to visual alone as in reading ....... 1 2 3
Lecture is fairer, since all hear for the same number of minutes ....... 3 3
Easier to take notes on lecture than on
Difficulty of finding library references ....... 2 2
....... 1 1 2
Trains students in attending lectures and taking notes on same ....... 1 1 2
Lecturer never makes assumption as to what the students already have studied ....... 1 1
Instructor gets a check on himself ....... - - 1 1
In writing up lecture notes later the student gets a review ....... 1 1
Lecture method enables the student to learn how to lecture ....... 1 1


The extent to which generalization is justifiable on the basis of the 
experiment just described is conjectural. All that is certain is that with 
the given instructor, the given students, the given topics, and the given 
procedures, the lecture method proved conclusively superior to the li-
brary-assignment method in so far as the given tests measuring compre-
hension and retention of subject-matter are indicative, that the less in-
telligent students and the students with poor reading ability found the
superiority of the one method over the other to be greater than did the 
more intelligent and better readers, and that with the great majority 
of the students regardless of intelligence the lecture method was the more 
popular. The results may be different with another instructor or with 
other students, other school subjects, other topics, other procedures, or 
other tests. We haven’t the data to say. It seems certain that a teach- 
ing procedure which one instructor finds successful might not be found 
so by another instructor. There probably are no teaching procedures 
which all instructors will find to be good or which all will find to be bad. 
We must recognize individual differences among instructors as well as 
among students. No one has a right to assert that the lecture method is 
the worst scheme ever devised for teaching college students, nor has he 
a right to assert that it is the best. For some instructors it probably is 
the worst, and for others it probably is the best. There is nothing wrong 
with the procedure of lecturing, but with the performer; there is nothing 
wrong with the method, but with the methodist. Perhaps more instruc- 
tors could succeed with the lecture method if they would work as hard 
in preparing their lectures as they expect their students to work on 
library assignments. 

All that has been said in this report about the lecture method has 
presupposed college teaching. No data are available to show the ad- 
visability of lecturing in the high school or in the elementary school. 
There is a distinction here which some critics of teacher training have 
overlooked. These critics have condemned instructors in teacher-train- 
ing classes in colleges for not exemplifying in their college classes the 
teaching procedures and techniques which their students should follow 
in high-school or in elementary-school teaching. They have criticized the 
college instructors for not practicing what they preach. Now this criti- 
cism is not clever. In fact, it is asinine. Certainly the critics would not 
contend that the teaching procedures and techniques of a primary teacher 
should be the same as those of a teacher in the junior high school. Then 
why should they expect the teaching procedures and techniques in a col- 
lege to be the same as those in a high school? There is an equal dif- 
ference between the students in one case and those in the other. 

Most orthodox prayers to Jehovah end with the word “Amen.” 
Most orthodox reports of researches in education end with the words, 
“further study is needed.” The investigation just reported reaches a 
conclusion that is unorthodox, but perhaps the investigator can save his 
professional scalp if he admits that further study is needed. This he 
expects to make. 




The Psychological Examinations at the 
Indiana State Teachers College 


J. W. JONES, Director, Division of Research, Indiana State Teachers 
College, Terre Haute 


“I have my chance
Each day there come to me some souls
Unnurtured to the world. My opportunity,
My work shall be to find their need
And help survey a path
That leads to the supply;
Then give them learning as a life to live
Not as a garment to be worn,
Help them gain courage, endurance, fairness, inquiry.”¹


—John Bretnall. 


IN the earlier days of psychological testing at the Indiana State 
Teachers College not all students were required to take the tests, but the 
results obtained under the administration of the tests by Dr. R. A. 
Acher and Prof. E. L. Abell led to the adoption of a regulation requir- 
ing that all new students take the psychological examination at the time 
of entrance to the college. With the establishment of the Division of 
Research in 1927 the testing program was turned over to the Division. 
Those of you who are familiar with the work of the American Council 
on Education will realize that the history of this type of work at Indi- 
ana State Teachers College parallels to some extent the development of 
the psychological examination sponsored by the American Council and 
developed under the guidance of Dr. Thurstone. It is only natural then 
that the next step in the development of our program should be the 
establishment of relations with the American Council thru the adoption 
of the Council’s examination as the examination to be used with our stu- 
dents. The efforts of the Council to extend the use of the Thurstone 
examination had already resulted in its widespread use in many colleges 
and universities and its gradual adoption in teachers colleges. 

A few of the teacher-training institutions in Indiana were using this 
test but no effort had been made to collect the results for Indiana into 
one report. Consequently, the next step in the development of our pro- 
gram was to encourage the uniform use of the American Council psycho- 
logical examination in all teacher-training institutions in the state. 
Fortunately the State Board of Education was interested in the analysis 
of data dealing with first-year students and requested that all teacher- 
training institutions file a complete record of their first-year students. 
Among the data asked for was a report of the students’ performance
on a psychological test. A letter was sent to all the colleges in Indiana
pointing out the advantages of using the same test in all the colleges.

¹ Bretnall, John. “The Teacher.” Journal of the National Education Association
20:119, April, 1931.

As a result of these efforts seventeen colleges in Indiana used one 
form or another of the American Council psychological examination last 
year. The data of the several colleges were compiled and reported back 
to the colleges from a central office. The same procedure has been used 
this year and a report compiled by one of the colleges. We are expect-
ing to continue the same coöperative program again next fall.

Considerable study has been made by the American Council of the 
reliability of the psychological examination. The 1930 Manual reports 
a coefficient of reliability of .941 for the 1928 edition and .943 for the 
1929 edition of the test. The examination has been used in approxi- 
mately three hundred colleges and universities with over 150,000 stu- 
dents. 

The tests made of the validity of the American Council psychological 
examination “are satisfactory for an intelligence test. The principal fac-
tors that influence these coefficients are the reliabilities of the tests and
of scholarship grades as well as the range of ability represented by
the freshman class. If the scholarship grades in a college are not reli- 
able or discriminating, the correlation between scholarship and tests 
scores will be lowered. This is found occasionally in those colleges in 
which the grades are assigned by standards that vary considerably from 
one instructor to another or from one department to another. When the 
grading standards of a college are maintained with fair uniformity 
among the different courses, the correlations between the tests and grades 
should be between .40 and .60. 

“Another important factor, not so often noted, is the range of ability 
of the freshman class. A college that selects its students carefully at the 
time of admission must expect to find lower correlations between tests 
and scholarship than the college in which selection of freshmen is less 
rigorous. The correlations are, in fact, highest for the colleges in which 
the range in ability is the greatest. From this it follows that, in gen- 
eral, the correlations between tests and scholarship will be lower in the 
Eastern colleges that select students largely by the New York State 
Regents examinations or by the College Entrance Board examinations 
and in the privately endowed universities than in the state universities 
where the selective process for the freshmen is not so severe. These 
comparisons may have no effect on the comparisons for seniors in the 
same colleges. It is largely a question of whether the selection of degree 
students should be made at the time of admission or during the fresh- 
man year. A low correlation between tests and scholarship may mean, 
therefore, a loose grading system or lack of uniformity of standards in 
the different departments, or it may mean a very highly selected fresh-
man class from which the failures have already been eliminated by en-
trance examinations or by other effective methods.”²
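
The effect of selection described in the passage quoted above can be illustrated with a small simulation. The sketch below is an illustration only, not a calculation from the American Council data: the class size, the underlying correlation of .60, the rule of admitting the top quarter on the test, and the names `pearson_r` and `simulate` are all assumptions made for the example.

```python
import math
import random


def pearson_r(xs, ys):
    """Plain Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=5000, true_r=0.6, seed=1931):
    """Return (r for the whole class, r for the top quarter on the test)."""
    rng = random.Random(seed)
    tests, grades = [], []
    for _ in range(n):
        test = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        # Built so that the population correlation of grade with test is true_r.
        grade = true_r * test + math.sqrt(1.0 - true_r ** 2) * noise
        tests.append(test)
        grades.append(grade)
    cutoff = sorted(tests)[int(0.75 * n)]  # hypothetical rule: admit the top quarter
    admitted = [(t, g) for t, g in zip(tests, grades) if t >= cutoff]
    r_all = pearson_r(tests, grades)
    r_selected = pearson_r([t for t, _ in admitted], [g for _, g in admitted])
    return r_all, r_selected


r_all, r_selected = simulate()
print(f"unselected class: r = {r_all:.2f}; selected class: r = {r_selected:.2f}")
```

Under these assumptions the coefficient for the selected class falls well below that for the unselected class, although the relation between test and grade is the same for every student.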

The American Council reports in the April Educational Record the
results of the examination based upon data of the previous fall. In many
colleges these national norms are received too late for use during the
current session with the result that the colleges use percentile ratings
based upon the performance of their own students. In Indiana we have
been able to supplement this local rating with the percentile rating based
upon state norms. However, the 1930 edition of the examination was
constructed so that the various parts were of practically the same degree of
difficulty as the corresponding parts in the 1928 and 1929 editions, making
it possible to compare scores directly. “It is therefore possible to
determine with a high degree of accuracy a student’s percentile rank in the
1930 edition as compared with several thousand students in other
colleges.”³

² Thurstone, L. L. and Thurstone, Thelma Gwinn. “The 1929 Psychological
Examination.” The Educational Record 11:112-13, April, 1930.

There are at least two obstacles to the most efficient use of the re-
sults of these examinations. These obstacles should be noted before pass-
ing to some specific uses. In the first place there is considerable skepti-
cism abroad as to the value of I. Q. ratings and intelligence tests. While
this paper is not concerned with the controversy between the pros and 
cons of the intelligence test movement, an observation is offered for con- 
sideration. 

The too-ready acceptance of the I. Q. as a measure of intelligence 
and the subsequent discovery of wide variation from performance-ability- 
achievement has caused the layman to question the value of the expert 
psychological researches in the nature of intelligence. To a very large 
extent this skepticism has reflected to the discredit of scientific investi- 
gation of intelligence. One result has been that psychologists who have 
long recognized the limitations of intelligence tests have proposed the use 
of the psychological examination in lieu of the intelligence examination 
and the percentile rating in lieu of the I. Q. It may be that as much 
harm is being done to the science of psychology in the use of the term 
“psychological examination” as was done to the scientific investigation 
of the workings of the human mind by the use of the word “intelligence.” 
It might be advisable to substitute developability-level⁴ examination for
the psychological examination, retaining the use of the percentile as a 
measure of this developability-level examination since it is a readily com- 
prehended value indicating the rank of an individual in relation to others 
of the group. 

The second obstacle to the most efficient use of the results of these 
examinations is the failure of students to put forth their best efforts at 
the time of the examination. While in large groups the number of stu- 
dents so doing is probably offset by those exerting their best efforts, the 
ability of these few represented by their subsequent achievement in col- 
lege in spite of their low ratings has had a tendency to develop the same 
type of skepticism mentioned above. Yet, on the whole, it seems that the
beneficial results from using the percentile in place of the I. Q. justify
its continued use.

Fully recognizing these and other limitations we are using the re- 
sults of the psychological examination in several ways: 

Members of our faculty have access to the ratings of all students
when they wish to consider these ratings as one of the factors in their
own researches. The paper by Mr. Lowell M. Tilson yesterday morning
and the paper by Dr. J. R. Shannon yesterday afternoon are illustrations
of particular uses to which our results have been put.

³ 1930 Manual, p. 10.

⁴ I am indebted to the late Dr. H. H. Young of the department of psychology of
Indiana University for this concept.

The percentile ratings of the various students are supplied to the 
instructors for their information. Studies of the various uses of such 
ratings have been reported to the faculty from time to time and the fac- 
ulty has been encouraged to consider such ratings as one factor in the 
study of the individual student and his needs. We do not issue a complete 
student list with ratings to all members of the faculty, but rather sup- 
ply the individual instructor with the ratings of his students at his re- 
quest. No effort has been made to determine the uses to which the in- 
structors put such information, but we are inclined to believe that the 
instructor is utilizing this information in an analysis of his individual 
student’s abilities. 

There is a third use to which we put these results which might be 
classified as an administrative use. When the mid-term report of failures 
is filed with the deans, the percentile ratings of the failing students are 
also supplied the deans. We have found that in many cases the percen- 
tile ratings are such as to indicate that these students could do better 
work. The deans are better able to advise with the students as a result 
of these ratings than without them. 

We are attempting to study the personality traits of our student 
body and are feeling our way as to a good method of advising with stu-
dents regarding their individual characteristics. At present we are
selecting those students who rank in the lowest quarter of a
given group on several personality factors for immediate attention.
These students are counselled by the Dean of Men and the Dean of
Women, who have available the psychological percentiles of the students.
Here again we are considering the percentile ratings on these psycho- 
logical examinations as one factor in the analysis of the needs of our 
students. We are finding in many cases that some of the undesirable 
personality traits as checked by our faculty accompany low performance 
on the psychological tests as well as low performance in scholarship. We 
are not in a position to say to what extent the low rating in scholarship 
has affected the low rating on personality factors, but it presents an in- 
teresting problem for later investigation. 

Since this program has been in the hands of the Division of Research 
an increasing number of students have come to the office of the Division 
asking information concerning their “I. Q. ratings.” It has been one of 
our assumed obligations to clear up the idea in the minds of the students 
that these examinations result in an I. Q. rating. Rather, we have in- 
sisted that the student think of his percentile rating as his rank in the 
group that took the test at the time he did. We find that this has a salu- 
tary effect upon the student and gives us at the same time the oppor- 
tunity to encourage him to work up to his performance-ability level. For 
example, there was Mr. A., a freshman who came to the office recently 
to inquire his rating upon the examination. When he was told that his 
rating was 68, an expression of humiliation crossed his face. He asked 
if that were not a low I. Q. rating. He was informed that it represented 
his rank in the group of students who took the examination at the time
he did. This young man belonged to a fraternity which had a third
quartile point of 64 with a median of 50 and ranked first in scholarship 
among the men’s organizations. Mr. A. ranked in the upper quarter of 
his fraternity. Mr. A. is on the special curriculum in our college. He 
was informed that the average freshman on the special curriculum had a 
percentile rating of 46. In other words, Mr. A. ranked among the high-
est 25 per cent of his fraternity and 22 points higher than the average
of the freshman class on the special curriculum. With this information 
before him, his feeling of humiliation at what he thought was a low rat- 
ing began to change to a feeling of confidence in his own ability. He was 
advised that he should be doing B work on his particular course, and 
with a feeling of pride, he said, “So far I have made 2 A’s, 4 B’s, and 2 
C’s.”
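
The distinction drawn for Mr. A., that a percentile rating is a rank within a particular group and not an I. Q., may be made concrete with a short sketch. It is offered only as an illustration: the raw scores are invented, and the convention of counting a tied score as one half is an assumption of the example, not a statement of the procedure used by the Division of Research. The function name `percentile_rank` is likewise hypothetical.

```python
def percentile_rank(score, group_scores):
    """Per cent of the group scoring below `score`, counting ties as one half."""
    below = sum(1 for s in group_scores if s < score)
    tied = sum(1 for s in group_scores if s == score)
    return 100.0 * (below + 0.5 * tied) / len(group_scores)


# Hypothetical raw scores for a small entering group (not data from the study).
entering_group = [62, 75, 88, 91, 104, 110, 117, 123, 131, 140,
                  146, 152, 158, 167, 175, 183, 190, 204, 221, 240]

# The same group with every score raised 30 points: a stronger comparison group.
stronger_group = [s + 30 for s in entering_group]

rank_in_own_group = percentile_rank(152, entering_group)
rank_in_stronger_group = percentile_rank(152, stronger_group)
print(rank_in_own_group, rank_in_stronger_group)
```

The same raw score receives a different percentile rating in each group, which is why the rating is always to be read with reference to the group that took the test.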

A few days later there was Miss B., another freshman who was 
concerned about her rating. She had a percentile rating of 98. The 
average of her sorority was 70. This sorority ranked first in all organi- 
zations in the school in scholarship. 

Mr. C. was called to our attention by the mid-term failure reports. 
His percentile rating was 22, and he withdrew from school because of 
five failures. 

Miss D. had been in school some time but had had to withdraw on 
account of her health. Later, when she was able to return, she consulted 
with our Dean of Women concerning the advisability of returning to 
school. The Dean of Women, taking into account her scholarship record 
and psychological rating, recommended that she use every effort to re- 
turn. She is now in school and doing good work. She has a percentile 
rating of 45 and a scholarship record of 9 A’s, 25 B’s, and 10 C’s. 

Aside from personal contact cases, studies have been made to de- 
termine whether or not the natural elimination which occurs in college 
has resulted in the retention of a higher grade of student based on the 
psychological ratings made at entrance. The natural assumption would 
be that this would be the case. If you are willing to accept the measure 
of central tendency as one criterion for evaluating the several classes in 
the college, then the data in Table I show an increase in two measures of 
central tendency as the classes moved from freshman to sophomore to 
junior and to senior standing. Table I further reveals some interesting 
information concerning the measure of central tendency of the students 
on the various curricula at the Indiana State Teachers College. In 
general those students preparing for teaching positions in academic sub- 
jects in high school have the highest average percentile rating. 




TABLE I. MEDIANS AND AVERAGE PERCENTILE RATINGS OF
CLASSES 1934, 1933, 1932, 1931, 1930, BY CURRICULA ON
PSYCHOLOGICAL EXAMINATION

Quarter | Median and Average for each class reported (Class of 1934, 1933, 1932, 1931, 1930), in the order printed

ELEMENTARY CURRICULUM
Fall, 1929 | 36 41 51 50 56 55
Winter, 1930 | 40 43 50 48 58 52 48 65
Spring, 1930 | 37 40 54 51 48 47 51 59
Fall, 1930 | 34 39 40 45 70 66 50 51
Winter, 1931 | 34 41 44 44 56 55

REGULAR CURRICULUM
Fall, 1929 | 67 62 64 60 66 60 78 70
Winter, 1930 | 71 63 64 60 66 58 7. 78
Spring, 1930 | 63 60 72 65 66 61 68 62
Fall, 1930 | 61 57 68 62 69 65 64 59
Winter, 1931 | 63 54 72 66 69 63 68 64

SPECIAL CURRICULUM
Fall, 1929 | 47 47 44 50 53 53 58 56
Winter, 1930 | 48 48 48 50 57 55 60 58
Spring, 1930 | 51 46 47 57 56 59 56
Fall, 1930 | 43 46 50 50 50 52 64 61
Winter, 1931 | 44 46 52 51 49 51 59 58

ACADEMIC CURRICULUM
Winter, 1930 | 51 50 ..
Spring, 1930 | 48 60 56 57

ALL CURRICULA
Fall, 1929 | 50 49 54 52 58 56 71 63
Winter, 1930 | 51 51 53 52 61 56 68 68
Spring, 1930 | 50 50 54 53 60 58 64 59
Fall, 1930 | 45 47 52 58 64 60
Winter, 1931 | 44 47 54 52 57 56 63 61

*Data lacking.




TABLE II. MEDIANS AND AVERAGE PERCENTILE RATINGS OF JUNIOR
COLLEGE, SENIOR COLLEGE, AND TOTAL COLLEGE ON PSYCHOLOGICAL
EXAMINATION

Quarter | Junior college (Median, Average) | Senior college (Median, Average) | Total college (Median, Average)

ELEMENTARY CURRICULUM
Fall, 1929 | 43, 45 | 51, 54 | 44, 45
Winter, 1930 | 44, 45 | 56, 55 | 46, 46
Spring, 1930 | 46, 46 | 49, 49 | 47, 46
Fall, 1930 | 37, 42 | 60, 62 | 38, 42
Winter, 1931 | 38, 43 | 56, 55 | 39, 43

REGULAR CURRICULUM
Fall, 1929 | 66, 60 | 72, 64 | 68, 62
Winter, 1930 | 67, 62 | 71, 68 | 69, 64
Spring, 1930 | 69, 63 | 67, 62 | 68, 62
Fall, 1930 | 64, 59 | 67, 62 | 65, 60
Winter, 1931 | 64, 59 | 69, 63 | 66, 61

SPECIAL CURRICULUM
Fall, 1929 | 47, 48 | 55, 54 | 50, 50
Winter, 1930 | 48, 48 | 59, 56 | 51, 51
Spring, 1930 | 49, 48 | 58, 56 | 52, 51
Fall, 1930 | 47, 48 | 56, 55 | 50, 50
Winter, 1931 | 48, 48 | 54, 54 | 50, 50

ACADEMIC CURRICULUM
Fall, 1929 | 76, 62 | 82 | 72, 64
Winter, 1930 | 51, 59 | 85 | 61, 61
Spring, 1930 | 60, 59 | 85 | 61, 63
Fall, 1930 | 50, 59 | *, * | 53, 60
Winter, 1931 | 46, 53 | 56 | 46, 54

ALL CURRICULA
Fall, 1929 | 54, 51 | 63, 59 | 54, 53
Winter, 1930 | 52, 51 | 64, 62 | 55, 54
Spring, 1930 | 52, 51 | 62, 59 | 55, 54
Fall, 1930 | 48, 49 | 61, 59 | 57, 52
Winter, 1931 | 49, 49 | 60, 58 | 52, 52

*Data lacking.




Table II brings together the measures of central tendency of our 
students enrolled in the junior college as compared with all students en- 
rolled in the senior college. Here again we find that which is to be 
expected—those students with the highest percentile ratings have a 
tendency to remain in school longer than those students with low 
percentile ratings. 

We have a classification of students at Indiana State Teachers Col- 
lege known as drop-outs. Drop-outs are students enrolled in a quarter 
who do not re-enroll the following quarter. We have been interested to 
know whether there is any relationship between drop-outs and psycho- 
logical ratings. Based upon a study covering the last four quarters we 
have found that 30 per cent of the drop-outs in these four quarters had 
a percentile rating below 30 and that 30 per cent had a percentile rating 
above 60. In other words, we find a narrower range of percentile ratings 
included in a given percentage at the low end of the scale than we find 
at the high end. It should be remembered in this regard that many stu- 
dents drop out for a quarter to earn enough money to return later and 
finish their college work. It seems safe to assume that the individuals 
who return are those whose previous performance indicates possibilities 
of subsequent success in college. 


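TABLE III. DISTRIBUTION OF DROP-OUTS*

Number of drop-outs at beginning of each quarter

Percentile | Winter quarter 1930 | Spring quarter 1930 | Fall quarter 1930 | Winter quarter 1931 | Total drop-outs
91-100 | 13 | 14 | 14 | 14 | 55
81-90 | 11 | 17 | 30 | 12 | 70
71-80 | 10 | 14 | 40 | 16 | 80
61-70 | 9 | 16 | 17 | 22 | 64
51-60 | 8 | 9 | 29 | 16 | 62
41-50 | 15 | 10 | 31 | 16 | 72
31-40 | 10 | 12 | 16 | 14 | 52
21-30 | 17 | 14 | 16 | 17 | 64
11-20 | 10 | 13 | 22 | 11 | 56
1-10 | 25 | 9 | 24 | 20 | 78
Total | 128 | 128 | 239 | 158 | 653
Median | 42 | 58 | 55 | 52 | 52
Average | 45 | 54 | 53 | 50 | 51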


Table should be read: of the students enrolled in the fall quarter, 
1929, 128 dropped out at the beginning of the winter quarter, 1930. 
Approximately one-third of these students had a percentile of 30 or less. 
Of the students enrolled in the winter quarter, 1930, 128 dropped out at 
the beginning of the spring quarter, 1930. Approximately one-third of 
these students had a percentile of 40 or less. 


* Drop-outs=Students enrolled in a quarter who did not re-enroll the following 
quarter. 
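
The two summary figures printed beneath the totals of Table III (42 and 45 for the first column) can be approximated from the frequencies alone. The sketch below assumes, since the scan does not show the row labels, that the ten rows are the percentile intervals 91-100 down to 1-10 in that order; it also assumes that the summaries are a median and an arithmetic average. The function names are hypothetical, and the original figures were presumably computed from the individual ratings, so only approximate agreement is to be expected.

```python
# Winter 1930 column of Table III, listed from the top row down.
# ASSUMPTION: the ten rows are the intervals 91-100, 81-90, ..., 1-10.
winter_1930 = [13, 11, 10, 9, 8, 15, 10, 17, 10, 25]
upper_bounds = [100, 90, 80, 70, 60, 50, 40, 30, 20, 10]


def grouped_mean(freqs, uppers, width=10):
    """Mean with every case placed at the midpoint of its interval."""
    total = sum(freqs)
    if total == 0:
        raise ValueError("empty distribution")
    return sum(f * (u - (width - 1) / 2) for f, u in zip(freqs, uppers)) / total


def grouped_median(freqs, uppers, width=10):
    """Median by linear interpolation inside the interval holding case N/2."""
    total = sum(freqs)
    if total == 0:
        raise ValueError("empty distribution")
    half = total / 2
    running = 0
    # Walk upward from the lowest interval.
    for f, u in sorted(zip(freqs, uppers), key=lambda pair: pair[1]):
        if f > 0 and running + f >= half:
            lower_boundary = u - width + 0.5
            return lower_boundary + (half - running) / f * width
        running += f
    raise ValueError("median interval not found")


mean_value = grouped_mean(winter_1930, upper_bounds)
median_value = grouped_median(winter_1930, upper_bounds)
print(f"average about {mean_value:.1f}, median about {median_value:.1f}")
```

That the grouped estimates for this column round to the printed 45 and 42 lends some support to the assumed ordering of the rows, though it does not establish the exact interval limits used in the original table.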




You will recall that in the statement from Dr. Thurstone quoted ear-
lier in this paper the assertion was made, “When the grading standards
of a college are maintained with fair uniformity among the different
courses, the correlations between the tests and grades should be between
.40 and .60.”⁵

Based upon a study of the distribution of marks at the Indiana State 
Teachers College covering all quarters from the fall quarter of 1924 
thru the fall quarter of 1930 and including 101,052 marks, we have 
found that quarter after quarter the distribution of marks at the insti- 
tution has been rather uniform and follows fairly well the normal curve. 
It would seem that on the basis of Dr. Thurstone’s statement we should 
find the correlation between the psychological test scores and scholarship 
ratings of our students varying between .40 and .60. We have found that 
in our institution the correlations between scholarship and percentiles on 
psychological examinations range from .39 to .68. These data are sum- 
marized in Tables IV and V. 


TABLE IV. CORRELATIONS OF PSYCHOLOGICAL EXAMINATION
RATINGS WITH OTHER VARIABLES FOR FIRST YEAR
RECORD OF CLASS OF 1931, I. S. T. C.

Psychological test correlated with: | Elementary group | Regular college group | Special college group | Total
… | … | .57 ± .04 | .63 ± .04 | .59 ± .03
Average high school record | .48 ± .03 | .37 ± .04 | .44 ± .04 | .44 ± .02
Average credit points in fall | .54 ± .03 | .48 ± .04 | .38 ± .04 | .46 ± .02
Average credit points in year | .56 ± .03 | .59 ± .04 | .42 ± .05 | .53 ± .02

TABLE V. CORRELATIONS FOR SELECTED GROUPS, I. S. T. C.
PSYCHOLOGICAL RATINGS AND SCHOLARSHIP

Group considered | Factors correlated | Coefficient of correlation, r ± P.E. | N
Organizations | Scholarship rankings and average psychological percentiles | .68 ± .07 | 30
… | … psychological percentiles | .43 ± .02 | 653
“Academic” students | Scholarship and psychological percentiles | .39 ± .08 | 46
Average coefficient for 18 colleges | Scholarship and test … | .494 (Range .32-.68) |
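
The probable errors attached to the coefficients in Table V are not explained in the text. The sketch below assumes the formula in common use for the probable error of a coefficient of correlation, P.E. = 0.6745 (1 - r²) / √N; whether this was the formula actually employed is an assumption, and the function name `probable_error` is hypothetical.

```python
import math


def probable_error(r, n):
    """Probable error of a correlation coefficient: 0.6745 * (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


# (description, r, N) as read from the three rows of Table V that give r and N.
rows = [("Organizations", 0.68, 30),
        ("Group of 653", 0.43, 653),
        ("Academic students", 0.39, 46)]

computed = {name: round(probable_error(r, n), 2) for name, r, n in rows}
for name, r, n in rows:
    print(f"{name}: r = {r:.2f}, N = {n}, P.E. = {computed[name]:.2f}")
```

On this assumption the computed values agree with the probable errors printed in the table, which suggests, without proving, that the usual formula was used.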


⁵ Thurstone, L. L. and Thurstone, Thelma Gwinn, op. cit.



Another method of analyzing the relationship between scholarship 
and the psychological ratings has to do with the distribution of honor 
roll students on the basis of their psychological ratings. We have found 
that over a period of nine quarters, 74 per cent of the students making 
the honor roll have a percentile rating above 80 and that 52 per cent 
have a percentile rating above 90. The data for the various quarters are 
summarized in Table VI. 


TABLE VI. DISTRIBUTION OF PSYCHOLOGICAL EXAMINATION
RATINGS OF HONOR ROLL STUDENTS*

Percentile | Fall 1927 | Winter 1928 | Spring 1928 | Fall 1928 | Winter 1929 | Spring 1929 | Fall 1929 | Winter 1930 | Spring 1930 | Total | Per cent
(1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12)
91-100 | 5 | 9 | 10 | 12 | 6 | 5 | 7 | 7 | 9 | 70 | 52
81-90 | 2 | 3 | 5 | 4 | 4 | 4 | 2 | 4 | 2 | 30 | 22
71-80 | .. | 1 | 2 | 1 | .. | 2 | 2 | 2 | 1 | 11 | 8
61-70 | 2 | 2 | 3 | .. | 1 | 1 | 2 | 1 | 1 | 13 | 10
51-60 | .. | .. | .. | 1 | 1 | 2 | .. | .. | .. | 4 | 3
31-40 | .. | 1 | 1 | .. | .. | 1 | .. | .. | .. | 3 | 2
21-30 | .. | .. | .. | .. | .. | .. | .. | 1 | 2 | 3 | 2
Number | 9 | 16 | 21 | 18 | 12 | 15 | 13 | 15 | 15 | 134 | 100
Range | 65-99 | 36-99 | 40-99 | 54-100 | 55-99 | 39-99 | 67-99 | 25-99 | 25-100 | 25-100 |
Average | 87 | 85 | 84 | 90 | 86 | 80 | 86 | 84 | 83 | 85 |
Median | 91 | 92 | 93 | 93 | 91 | 86 | 91 | 85 | 95 | 92 |

Still another method of checking the relationship between scholar- 
ship and the psychological examination rating is the calculation of the 
average scholarship and average percentile rating on the psychological 
examination of the various organizations in our college. The data for 
such a comparison are brought together in Table VII, in which two 
methods of ranking these organizations are revealed. In columns 3 and 
4 are the rankings of the several organizations irrespective of the type 
of organization. The correlation between scholarship and the psycho- 
logical test ratings for these two columns is .68 (Table V). In columns 
5 and 6 the organizations are ranked in comparison with other organi- 
zations of the same type. The correlation between scholarship and rank 
in the psychological examination ratings for sororities is .78 and for 
clubs and societies it is .75. 


* Honor students=Students earning A in all subjects for a given quarter. 




TABLE VII. SCHOLARSHIP AND PSYCHOLOGICAL EXAMINATION
RANKINGS OF ORGANIZATIONS AT THE INDIANA
STATE TEACHERS COLLEGE

Columns: (1) Organization | (2) Number of members | (3) Rank of total in scholarship by registrar's office | (4) Rank of total in average psychological rating | (5) Rank within type in scholarship by registrar's office | (6) Rank within type in average psychological rating | (7)-(10) Q1, Median, Q3, Average

SORORITIES
Lambda Delta Phi | 24 | 1 | 6 | 1 | 2 | 60, 75, 88, 70
Gamma Gamma | 23 | 11 | 7 | 2 | 3 | 58, 78, 87, 68
… | 31 | 13 | 5 | 3 | 1 | 55, 80, 92, 71
Kappa Kappa | 25 | 15 | 9 | 4 | 4 | 43, 66, 85, 64
Epsilon Delta | 14 | 19 | 23 | 5 | 7 | 36, 54, 72, 52
… | 24 | 20 | 26 | 6 | 8 | 34, 48, 64, 49
Delta Sigma | 16 | 22 | 27 | 7 | 9 | 19, 38, 70, 48
Omega Sigma Chi | 12 | 23 | 15 | 8 | 6 | 44, 54, 70, 58
… | 28 | 28 | 14 | 9 | 5 | 40, 60, 77, 59
… | 12 | 29 | 30 | 10 | 10 | 30, 60, 39

FRATERNITIES
Alpha Sigma Tau | 40 | 14 | 22 | 1 | 4 | 36, 50, 64, 52
Chi Delta Chi | 17 | 17 | 11 | 2 | 1 | 39, 56, 85, 61
… | 23 | 24 | 17 | 3 | 2 | 82, 57
Delta Lambda Sigma | 41 | 26 | 21 | 4 | 3 | 59, 73, 53

CLUBS AND SOCIETIES
Mathematics Club | 16 | 2 | 2 | 1 | 2 | 78, 87, 94, 81
Classical Club | 13 | 3 | 4 | 2 | 4 | 53, 78, 90, 72
Sycamores Players | 16 | 4 | 10 | 3 | 6 | 50, 70, 80, 63
… | 6 | 5 | 24 | 4 | 13 | 38, 49, 52
Science Club | 36 | 6 | 18 | 5 | 10 | 35, 65, 79, 58
Social Studies Club | 20 | 7 | 8 | 6 | 5 | 40, 70, 65
Eclectic Society | 14 | 8 | 3 | 7 | 3 | 49, 80, 92, 73
Le Cercle Français | 12 | 9 | 1 | 8 | 1 | 70, 90, 96, 82
Music Club | 64 | 10 | 12 | 9 | 7 | 40, 60, 83, 61
A… | 19 | 12 | 19 | 10 | 11 | 43, 77, 56
Poets Club | 10 | 16 | 13 | 11 | 8 | 43, 60, 86, 60
Home Economics Club | 52 | 18 | 20 | 12 | 12 | 34, 57, 76, 55
W.A.A. Council | 17 | 21 | 28 | 13 | 15 | 43, 48, 57, 47
Athenaeum Society | 32 | 25 | 25 | 14 | 14 | 36, 50, 67, 51
Primary Club | 26 | 27 | 29 | 15 | 16 | 22, 33, 59, 42
Commerce Club | 105 | 30 | 16 | 16 | 9 | 58, 83, 57
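
The method by which the rank correlations for Table VII were computed is not stated. As an illustration only, the sketch below applies the rank-difference formula, rho = 1 - 6 Σd² / (n(n² - 1)), to the ten sororities, taking their ranks from columns 5 and 6 of the table. The choice of formula and the function name `rank_difference_rho` are assumptions of the example.

```python
def rank_difference_rho(ranks_a, ranks_b):
    """Spearman's rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), for untied ranks."""
    n = len(ranks_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


# Sorority ranks as read from columns 5 and 6 of Table VII, in table order.
scholarship_rank = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
psychological_rank = [2, 3, 1, 4, 7, 8, 9, 6, 5, 10]

rho = rank_difference_rho(scholarship_rank, psychological_rank)
print(f"{rho:.2f}")
```

The coefficient so obtained is about .77, close to but not identical with the .78 reported for sororities, so the figures in the text may rest on a somewhat different procedure.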


As a result of these studies we are inclined to believe that there is 
a rather significant relationship between performance on the psycho- 
logical examination taken at the time of the student’s entrance in the 
Indiana State Teachers College and his subsequent scholarship. 


Fig. 1. The range, median, Q1 and Q3 on the American Council Exam-
ination for 1927 as given in Indiana colleges in the fall of 1929.
(Horizontal axis: score, 0 to 320. Groups charted: American Council 1927
national norms, total Indiana, Ball, Evansville, Indiana University,
I.S.T.C., Purdue, St. Benedict, St. Francis, St. Joseph, St. Mary of the
Woods.)


I.U. -- Indiana University
P.U. -- Purdue University
B.S.T.C. -- Ball State Teachers College
I.S.T.C. -- Indiana State Teachers College
F.C. -- Franklin College
V.U. -- Valparaiso University

Fig. 2. The range, the median, and the Q1 and Q3 for each of the
schools listed above. (Data obtained from giving the 1930 edition
of the American Council Examination to entering students, 1930)


It was indicated earlier in this paper that efforts had been made to 
bring together the results of the psychological testing program in the 
teacher-training institutions of the state. Figure 1 is taken from the 
report of the results of the psychological examination given in the fall 
of 1929 and Figure 2 is taken from the report of the results of the fall 
of 1930. The 1929 data were tabulated by the Division of Research of 
Indiana State Teachers College and the 1930 data were tabulated by 
Professor E. L. Yeager of the Department of Psychology of Indiana 
University. For purposes of comparison between the raw scores on the 
1930 examination in the state the distributions in Table VIII are pre- 
sented. 




TABLE VIII. FREQUENCIES OF RAW SCORES OF ALL ENTERING
STUDENTS AT INDIANA, PURDUE, TERRE HAUTE,
MUNCIE, FRANKLIN, AND VALPARAISO ON A. C. E. TEST,
1930 EDITION, TOGETHER WITH THE PERCENTILE VALUES
OF THE SCORES

Columns: Class interval | Indiana (F, Per cent) | Purdue (F, Per cent) | Terre Haute (State) (F, Per cent) | Muncie (Ball) (F, Per cent) | Franklin (F, Per cent) | Valparaiso (F, Per cent)

Frequencies and per cents by class interval, in the order printed:
5 100 11 100 4 100
13 99 26 99 5 100 100 .. 2 98
24 98 49 97 1 99 4 100 5 100 3 97
49 96 96 94 11 99 8 99 3 95 10 96
62 92 127 87 19 97 97 2 91 14 91
89 87 151 79 32 94 34 92 10 89 15 84
140 79 169 68 53 63 85 4 79 16 77
67 188 57 57 71 71 16 74 30 69
154 55 200 44 75 70 7 56 12 54
158 186 31 82 57 65 42 17 45 25 42
158 28 145 18 80 43 66 28 17 27 29 30
109 15 83 8 84 30 44 14 8 9 21 16
56 6 27 2 66 15 19 5 .. 9 6

 | Indiana | Purdue | Terre Haute | Muncie | Franklin | Valparaiso
Median | 132.66 | 149.10 | 109.88 | 131.34 | 128.33 | 132.92
Q1 | 95.09 | 111.16 | 73.63 | 95.85 | 98.23 | 92.62
Q3 | 172.68 | 192.75 | 150.78 | 165.56 | 162.50 | 175.94

The results of our studies at the Indiana State Teachers College and 
our contact with studies made in other teacher-training institutions lead 
us to believe that there are certain advantages to be attained by a con- 
tinued use of the examination and a further study of the results. It 
seems evident from the work which Mr. Tilson has been doing in the 
department of music that it would be possible to develop a plan of guid- 
ance of the students in their selection of the various curricula. We be- 
lieve that a wider dissemination of the information attained from the re- 
sults of these tests could be used to influence the guidance of students 
toward various college careers, when considered as one factor in such a 
guidance program. 

The teachers colleges of the United States are very much concerned 
with studying the problems of the individual student. There was or- 
ganized in Chicago in March of this year an organization to be known 
as the Teachers College Personnel Association. This association is con-
cerned with the carrying on of coöperative studies of personnel work in
the teachers colleges of the United States. 

Five problems have been undertaken by the association for immediate
study. The first of these problems has for its purpose the carrying
on of research in an attempt to make and standardize an intelligence
test that would be particularly adapted for use in teachers colleges. As
an initial step in this coöperative program the association is recommending
to the members in good standing of the American Association of
Teachers Colleges the use of the American Council psychological
examinations for the fall of 1931.

The second problem to be considered by the association has for its 
objective the construction and standardization of achievement tests for 
use in studying incoming freshmen. 

The third problem has for its objective the construction and use of 
teachers college personnel records for the use of the teachers colleges in 
keeping all records. 

The fourth problem has for its objective the construction and stand- 
ardization of a teaching-aptitude test. All attempts that have been made 
up to date to develop an aptitude test will be studied. An effort will 
then be made to produce for the teachers colleges a teaching-aptitude 
test to be used in the selection of freshmen. 

The fifth problem has for its objective the construction of a pro-
fessional test for teachers college seniors to be used in coöperation with
the American Council on Education in its nation-wide program of test- 
ing college seniors in May, 1932. 

While membership in the Teachers College Personnel Association is 
limited to members in good standing of the American Association of 
Teachers Colleges, those of us interested in the development of these 
problems wish to extend an invitation to all public school people inter-
ested in any phase of the program outlined to coöperate to the fullest
extent. The teachers colleges are vitally concerned with a careful study 
of the problem of teacher education in the United States and welcome 
constructive criticisms of their programs at any time. 




Predicting Scholarship in the Junior 
High School 


Fowler D. Brooks, Chairman of the Departments of Education and
Psychology, DePauw University 


PREDICTION and control are highly important practical values of 
scientific work along all lines. The development of science increases 
man’s knowledge of cause and effect, enables him to forecast the
outcomes of many events, and gives him greater control over his
environment. Educational measurement, like the other branches of science,
yields data of value in predicting certain features of human behavior,
and makes possible its more effective control thru wise guidance and
direction. As a science of education develops we may expect rich returns 
in knowledge which will enable us to forecast with increasing precision 
various important phases of human activity and endeavor; and, by so 
doing, it will enable us to provide more effective programs of educa- 
tional and vocational guidance. 

In this paper we are considering the prediction of scholarship in the 
junior high school,—scholarship being defined very largely in the present- 
day narrow academic sense. High school principals, superintendents of 
schools, and others engaged in the organization and administration of 
schools have need for adequate means of prophesying scholastic success 
of pupils at various stages in their progress thru the schools. Accurate 
prediction of scholarship is needed to enable the school to adapt its edu- 
cational demands and instructional activities to pupil ability,—a primary 
principle of mental hygiene in schools. We are generally agreed that the 
school should be a place where pupils and teachers live happily and 
effectively together while engaged upon socially and individually useful 
activities. To make it such a place several things must be done. We 
must construct the proper sorts of curricula; we must use suitable meth- 
ods of instruction; and we must find out a great deal about how much 
we reasonably can expect each child to do. Overworking children with 
its consequent strain and stress does not promote the mental health of 
boys and girls. Allowing children to loaf and dawdle along, forming 
habits of idleness and laziness, is not conducive to their present and 
future effectiveness and mental well-being. Relatively accurate forecasts 
of scholarship are of value for placement and sectioning, and have con- 
siderable significance in making the schools’ demands proportionate to 
the pupils’ abilities. 

We realize clearly, however, that, as our conceptions of education 
change, we may expect a different emphasis upon scholarship as we now 
know it. To the extent, for example, that creative activities and broader 
social development are regarded more highly, academic scholarship as 
now conceived in many schools may come to have less educational sig- 
nificance. The investigations which we are reporting relate to the pre- 
diction of academic success or scholarship in the junior high school. 



An extensive literature is developing on the prediction of scholarship 
in secondary schools.¹ The writer studied the value of several factors
for prophesying the scholastic success of pupils in the first year of the
junior high school.² Nine group intelligence tests, the Stanford-Binet
Intelligence Test, five group achievement tests, chronological age, pre- 
vious school marks, and teachers’ estimates of intelligence and effort 
were investigated. Two criteria of scholarship were used: (1) the aver- 
age of each pupil’s marks for both semesters in mathematics, history, 
geography, and English; and (2) a composite comprising the average 
used in (1) and his scores on the Stanford Achievement Tests. The sec- 
ond criterion is the better one, and is used in making all comparisons. 
The conclusions from the correlation data of that study are as follows: 

Sixth-grade marks correlate .70 with the composite scholarship cri- 
terion for 93 pupils who had been in junior high school for one year. 
Raw scores from nine group intelligence tests give an average
correlation of .57 with this criterion (ranging from .33 to .66); group-test
mental ages yield correlations averaging .55 (ranging from .33 to .65); group
I. Q.’s have an average correlation of .67 (ranging from .50 to .74). Five 
achievement tests have an average correlation of .50 with this second 
criterion of scholarship (ranging from .38 to .62). 

Chronological age correlates -.49 with scholastic success in the
first year of junior high school. 

The multiple correlations between scholarship in the first year of 
junior high school and the combination of sixth grade marks and intel- 
ligence quotients from one group intelligence test average .79 (ranging 
from .72 to .83); that is, using a group intelligence test with sixth-grade 
marks raises the correlation with junior high school scholarship an 
average of 9 points, i. e., from .70 to .79. One of the achievement tests 
adds about 6 points, the multiple correlations averaging .76 and ranging 
from .72 to .79. 
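
The gain from adding a second predictor can be illustrated with the standard two-predictor formula for a multiple correlation. The sketch below is a minimal illustration only. The correlations of .70 and .67 are the reported figures for sixth-grade marks and the average group I. Q.; the intercorrelation of .50 between the two predictors is a hypothetical value chosen for the example, since none is reported here, and the function name is likewise our own.

```python
import math


def multiple_r(r_y1: float, r_y2: float, r_12: float) -> float:
    """Multiple correlation of a criterion with two predictors.

    r_y1, r_y2: correlation of each predictor with the criterion.
    r_12: correlation between the two predictors.
    """
    if abs(r_12) >= 1.0:
        raise ValueError("the two predictors must not be perfectly correlated")
    r_squared = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)
    return math.sqrt(r_squared)


# Reported: sixth-grade marks (.70) and average group I. Q. (.67).
# ASSUMPTION: .50 is a hypothetical intercorrelation, not a figure from the study.
R_MARKS = 0.70
R_IQ = 0.67
R_12_ASSUMED = 0.50

r_combined = multiple_r(R_MARKS, R_IQ, R_12_ASSUMED)
gain_in_points = round((r_combined - R_MARKS) * 100)
```

Under that assumed intercorrelation the combined value comes out close to the reported average of .79. The more highly the two predictors correlate with each other, the less the second one adds.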

Chronological age adds two-thirds as much to the predictive value 
of sixth-grade marks as does a group intelligence test and, on the
average, slightly more than does one of the five achievement tests used.³ The
multiple correlation of sixth-grade marks and chronological age with the 
criterion is .76. 

By adding a second group intelligence test or an achievement test, 
such as the Holley Sentence Vocabulary Test, to the combination of 
sixth-grade marks and a group intelligence test, we get a multiple cor- 
relation of .84 or .85 with first-year scholarship. 

It is better to use sixth-grade marks and one group intelligence test
than to use two group intelligence tests. In the former case we get
multiple correlations averaging .79; in the latter case, .76. Sixth-grade
marks and chronological age, or one of the achievement tests, have as
high predictive value as the intelligence quotients from two group
intelligence tests. (See Table I.)

¹ See, for example: Brooks, F. D. Psychology of Adolescence. New York City,
Houghton, Mifflin Company, 1929. Bibliography at the end of Chapter 17. Also
Symonds, Percival Mallon. Measurements in Secondary Education. New York City, The
Macmillan Company, 1929. Chapter 19.

² Brooks, F. D. “Sectioning Junior High School Pupils by Tests and School Marks.”
Journal of Educational Research 12:359-69, April, 1925.

³ We used the Stanford Achievement battery as part of the criterion of scholarship,
and so did not determine its predictive value. We feel sure, however, from other
information about it, that its predictive value would be high.


TABLE I. CORRELATIONS SHOWING THE VALUE OF CERTAIN
FACTORS FOR PREDICTING SCHOLARSHIP IN FIRST YEAR
OF JUNIOR HIGH SCHOOL. N = 93*

                                                 Measure of scholarship
                                                            Combined marks in
Factor                                          Marks in    Grade VII and
                                                Grade VII   Stanford Achievement
                                                            Test Scores
School Marks, Grade VI ......................     .676        .696
School Marks, Grades V and VI ...............     .683        .636
Teachers’ Estimates of Intelligence and
  Effort: First Month in Grade VII ..........     .541        .618
Thorndike-McCall Reading Test ...............     .511        .617
Holley Sentence Vocabulary Test .............     .462        .579
Group Vocabulary Test (Binet list) ..........     .400        .503
Kelly-Trabue Language Completion Test,
Woody-McCall Arithmetic Test ................     .342        .383
Binet Test—
Group Intelligence Tests:
  Dearborn C
  Dearborn D
  Haggerty Delta 2
  Illinois A
  Miller A
  National A
  Otis Self-Administering A
  Pintner Non-Language
  Terman A
  Mean for Intelligence Group Tests

* Brooks, F. D. Psychology of Adolescence. p. 560.


We also may find out the predictive value of these various factors 
(1) by dividing the pupils into three sections on each one, (2) by divid- 
ing them into three sections according to seventh-grade scholarship, and 
(3) by comparing each of the former with the latter to see which basis 
of sectioning really does give groups which are the most homogeneous in 
achievement. Here again sixth-grade marks form as accurate a basis of 
sectioning as the Stanford-Binet or Terman Group Intelligence Tests, 
and a better basis than the other group tests; but a combination of sixth- 
grade marks and I. Q.’s from a group intelligence test, like Haggerty 
Delta 2, gives very much better results than any one factor alone. (See 
Table II.) 
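
The three steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: pupils are split into equal thirds by rank, ties are broken by order of listing, and the six pupils' scores are hypothetical. The function names are ours, and the rules actually used to form the sections are not described here.

```python
def into_thirds(scores):
    """Assign each pupil to section 0, 1, or 2 by rank (0 = lowest third)."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    sections = [0] * n
    for rank, pupil in enumerate(order):
        sections[pupil] = rank * 3 // n
    return sections


def agreement(predictor, criterion):
    """Per cent correctly sectioned, displaced one section, displaced two."""
    pred = into_thirds(predictor)
    crit = into_thirds(criterion)
    counts = [0, 0, 0]
    for p, c in zip(pred, crit):
        counts[abs(p - c)] += 1
    return [round(100 * k / len(pred), 1) for k in counts]


# Hypothetical scores for six pupils, for illustration only.
sixth_grade_marks = [62, 71, 75, 80, 88, 93]
seventh_grade_criterion = [60, 78, 70, 74, 95, 90]
result = agreement(sixth_grade_marks, seventh_grade_criterion)
# result lists the three percentages in the order used in Table II.
```

A basis of sectioning is the better the larger the first percentage and the smaller the third.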


TABLE II. PER CENT OF AGREEMENT OF SECTIONING BY
TESTS AND SCHOOL MARKS WITH SECTIONING BY CRITERION
OF SCHOLARSHIP IN FIRST YEAR OF JUNIOR
HIGH SCHOOL: THREE-FOLD GROUPING. N = 93*

                                          Per cent     Per cent      Per cent
Basis of sectioning                       correctly    displaced     displaced
                                          sectioned    one section   two sections
Dearborn C (I.                              54.8         43.0           2.2
                                            52.7         40.8           6.5
Mean, Group Intelligence Tests .........    55.3         39.4           5.2
Thorndike-McCall Reading Test ..........    50.5         40.9           8.6
Woody-McCall Arithmetic Test ...........    52.7         34.4          12.9
Holley Vocabulary Test .................    45.2         43.0          11.8
Kelly-Trabue Language Completion
Group Vocabulary (Binet list) ..........    47.3         40.9          11.8
Mean, Achievement Tests ................    47.1         40.9          12.0
Stanford-Binet (I. Q.) .................    64.5         30.1           5.4
Chronological Age ......................    51.6         40.9           7.5
Sixth-Grade Marks ......................    63.4         34.4           2.2
Combination of Sixth-Grade Marks and

* Ibid., p. 561.


A composite of sixth-grade marks (or fifth- and sixth-grade marks)
and ratings on a good group intelligence test gives better prediction of
average marks in the first year of junior high school than do any other two
factors upon which we have reliable data. We would expect, however, a 
battery of achievement tests, such as the Stanford or Public School 
Achievement Tests, to have high predictive value. 

We also carried on an investigation for three years in a large junior 
high school in Baltimore, Maryland, and have data on a class of ap- 
proximately three hundred pupils which we have not reported previ- 
ously. We checked the prognostic value of the following factors: 

1. Chronological age.
2. Fifth-grade average marks.
3. Sixth-grade average marks.
4. Teachers’ estimates of intelligence (3 teachers for each pupil by
   the end of first month in 7B) upon a seven-fold basis.
5. Illinois Intelligence Test, mental age.
6. National Intelligence Test, mental age.
7. Thorndike-McCall Reading Test.
8. Woody-McCall Arithmetic Test.
9. Average school marks for each half-year and for each year of
   the three years of junior high school.
10. School marks in each subject for each half-year and for each
    year thruout the three years of junior high school.


Our criterion of scholarship is the school marks assigned by the 
junior high school teachers. So much has been said about the unreli- 
ability of teachers’ marks, and they have been so generally berated by 
workers in the field of measurements, that many would object to our 
using them as a criterion of scholarship. 

Students of educational measurements are familiar with the classic 
investigations of Starch and Elliott in which many high school teachers
graded a geometry paper, an English paper, or a history paper. The
wide variation in marks was noteworthy. Thus 116 geometry teachers 
in North Central Association high schools graded a geometry paper. The 
marks ranged from 28 to 92, indicating that the mark given the paper 
was more a function of who graded the paper than of what the pupil 
wrote on it. The unreliability of marks assigned to a set of papers has 
been taken as evidence of the unreliability of term marks, altho it should 
be seen readily that discovering the unreliability of the former does not 
prove that the latter also are unreliable. Term marks are based upon 
daily recitations, informal quizzes, and special assignments and reports, 
and we now very commonly find high-school teachers using objective, 
new-type examinations. 

On a priori grounds we would expect term or quarterly marks to be 
much more reliable than marks on a single paper or set of papers, just 
because the former are composites. The Spearman-Brown prophecy
formula gives a rough index of the higher reliability of a composite over its
components, and gives support to our view of the greater reliability of
term marks. In addition to these a priori considerations, some
experimental evidence has been discovered indicating that teachers’ term marks
have much reliability. Dr. J. Carey Taylor, now assistant superintend- 
ent of schools in charge of secondary schools in Baltimore, Maryland, in 
his doctor’s dissertation (completed under the writer’s direction at Johns 
Hopkins University in June, 1930) shows that quarterly marks in the 
junior high school have reliability coefficients ranging from the .80’s up 
to .95,—far higher than most workers in measurements would expect 
them to be. 
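
The a priori argument can be made concrete with the Spearman-Brown prophecy formula, which gives the expected reliability of a composite of k comparable components, each of reliability r, as kr / (1 + (k - 1)r). The sketch below is illustrative only. The single-paper reliability of .40 and the count of ten components are hypothetical figures, not values from Dr. Taylor’s study, and the formula assumes components of roughly equal reliability.

```python
def spearman_brown(r_single: float, k: float) -> float:
    """Predicted reliability of a composite of k comparable components."""
    return k * r_single / (1 + (k - 1) * r_single)


# ASSUMPTION: hypothetical figures chosen for illustration.
# A single marked paper with reliability .40, and a term mark treated
# as a composite of 10 such components.
single_paper = 0.40
term_mark = spearman_brown(single_paper, 10)
```

Even a modest reliability for a single paper yields a much higher predicted reliability for the composite, which is the point of the argument.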

We have calculated more than four hundred simple coefficients of 
correlation between each of the various factors whose predictive value 
we sought and the different criteria of scholarship. We have also
calculated more than 2,100 multiple coefficients of correlation between the
criteria of scholarship and various pairs of predictive factors. Both the 
simple and multiple correlations show a wide variation,—from low to 
reasonably high or high. Without going into detail, these coefficients 
seem to indicate the following conclusions for the data used: 


1. Predicting average marks in the first year of junior high school. 


a. Teachers’ estimates of intelligence have the highest predictive 
value of any one factor, the correlation being .72; average sixth-grade 
marks rank second, with a correlation of .61; the other factors show
correlations of less than .50, that of chronological age being -.43. Reading
and arithmetic tests showed correlations of less than .40. 

b. Sixth-grade marks and teachers’ estimates of intelligence form 
the best two-factor combination, the multiple correlation being .79. All 
multiple correlations involving teachers’ estimates of intelligence are 
above .70, as would be expected from the magnitude of the simple cor- 
relation involving teachers’ estimates of intelligence. Fifth- and sixth- 
grade marks correlate .74 with first-year scholarship. Multiple
correlations of any two of the following tests (Illinois M. A., National M. A.,
Thorndike-McCall Reading, and Woody-McCall Arithmetic) are .50 or
less.




2. Predicting average marks in second and third years of the junior 
high school. 


Three courses are pursued by students during their second and third 
years in the junior high school in which our investigation was carried 
on: academic, technical, and commercial. Most of the boys in the tech- 
nical course entered a technical high school at the end of the second year 
of the junior high school, so that we have too few data for their third 
year to be of any value. 

a. The best single factor to predict second-year average is first- 
year average, the correlations being .88, .90, and .72 in the academic, 
technical, and commercial courses, respectively. Third-year average is 
best predicted by second-year average in the academic and commercial 
courses, the correlations being .78 and .58, respectively. Sixth-grade 
marks, teachers’ estimates of intelligence, and fifth-grade marks corre- 
late around .55 to .65 with second-year average marks. The National 
Intelligence Test correlates .53 with second-year average in the technical 
course; the Woody-McCall Arithmetic Test, .50; and chronological age, 
-.55. All the other single factors correlate less than .50.

b. The only two-factor combination which is materially better than 
first-year average for predicting second-year average in the academic 
course is the combination of first-year average and chronological age, 
the multiple correlation being .91. In the technical course first-year 
average and either the Woody-McCall Arithmetic Test or fifth-grade 
average marks yield a multiple correlation of .92 with second-year aver- 
age. In the commercial course second-year average is predicted slightly 
more accurately by some two-factor combination such as first-year aver- 
age and (1) sixth-grade marks, (2) teachers’ estimates of intelligence, 
(3) Illinois M. A., or (4) National M. A. than by first-year average 
alone. In predicting third-year average in the academic course, none of the
two-factor combinations is materially better than second-year average marks
alone, no multiple correlations being as high as .80.

3. Predicting scholarship in the various subjects in the junior high
school. 

a. Teachers’ estimates of intelligence and sixth-grade marks are 
the best single factors for predicting scholarship in the first year of 
junior high school, the correlations ranging from .45 to .69 with most of 
them more than .57. The first-year average or the first-year mark in a 
subject and the same factors in the second year are the best single fac- 
tors for forecasting scholarship in any subject in the second and third 
years, respectively; combining them gives the best two-factor combina- 
tions; but the correlations are much lower in the commercial course than 
in either of the other two,—in fact, the factors we have used are of little 
value in prophesying success in the strictly commercial subjects of the 
commercial course. 


Conclusions 


1. If a school system needs to know what factors best predict schol- 
arship in the junior high school, it should study the problem in its own 
schools and find them out. In doing this great care must be taken to 
insure reliable and valid results. 




a. Valid criteria must be set up. 

b. Adequate analysis must be made so as to isolate all important 
factors and thus avoid the dangers of the unconscious assumptions which 
follow in the wake of inadequate analyses. 

c. The investigation should cover several years, each treated sep- 
arately, to show trends, instead of putting data together for several 
years. This latter procedure gives a larger number of cases for the cor- 
relations, but does not reveal what the correlations are for any one year 
or class. If the school is small, the data may be combined for every two 
or three years. 

2. The writer’s investigations in several junior high schools in 
Baltimore, Maryland, and the investigations of some of his graduate 
students at Johns Hopkins University in other junior high schools indi- 
cate that scholarship as we now define it is best predicted by a combina- 
tion of two or more factors such as: 

a. Previous scholarship as measured by 


(1) School marks or 
(2) Batteries of achievement tests such as the Stanford or Public 
School Achievement Tests. 

b. Mental ability as indicated by 

(1) Group intelligence test results or 

(2) Teachers’ estimates of intelligence. 

c. Scores on certain aptitude tests, such as those for Latin, algebra, and
modern foreign language.

3. If previous school marks are used, their predictive value can be 
increased by finding out what adjustments are necessary to render the 
marks comparable from feeder schools which have different standards. 
This is a problem which needs careful study. It is possible by appro- 
priate techniques to increase the predictive value of marks from different 
schools as shown in a study of high school and college marks carried on 
by Dr. Hawks under the writer’s direction at Johns Hopkins University 
in 1929. 


4. As our conceptions of education change,—especially, as we place 
greater emphasis upon creative activities and a broader social training, 
we will need evaluations of these and other factors which may be of value 
then in helping the school arrange its demands according to pupils’ needs 
and abilities. 
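
One simple adjustment of the kind mentioned in the third conclusion is to express each pupil’s mark relative to the marks given in his own feeder school. The sketch below is a hypothetical illustration and not the technique used in Dr. Hawks’s study, which is not described here. It converts marks to standard scores within each school. The school names and marks are invented, and the method assumes that the pupils of the several schools are of comparable ability.

```python
from statistics import mean, pstdev


def standardize_within_school(marks_by_school):
    """Convert raw marks to standard scores computed separately per school."""
    z_scores = {}
    for school, marks in marks_by_school.items():
        center = mean(marks)
        spread = pstdev(marks)
        if spread == 0:
            raise ValueError(f"no variation in marks for {school}")
        z_scores[school] = [(m - center) / spread for m in marks]
    return z_scores


# ASSUMPTION: invented feeder schools. School B marks more leniently
# and over a narrower range than School A.
raw = {"School A": [70, 80, 90], "School B": [80, 85, 90]}
adjusted = standardize_within_school(raw)
```

After the adjustment, pupils holding the same relative position in their own schools receive the same score, whatever the marking standard of the school.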


List of Bulletins 
in the Field of Education, 
Indiana University 


The following is a list of the bulletins published by the School of 
Education, Indiana University. 


All bulletins which are available at the present time can be secured 
thru the University Bookstore for fifty cents ($0.50) per copy, with the 
exception of the Second Revision of the Bibliography of Educational 
Measurements, the Bibliography on School Buildings, Grounds, and 
Equipment, and The Philosophy of Human Relations—Individual and 
Collective, which can be secured through the Bureau of Coöperative
Research, School of Education, for seventy-five cents ($0.75) per copy.

Proceedings of the High School Principals’ Conference (November 
23 and 24, 1923). Vol. I, No. 1, 1924. 85 p. (Supply exhausted.) 

Investigation of Nursing as a Professional Opportunity for Girls. 
Part I, Technical Study; Part II, Vocational Information Bulletin. By 
Florence E. Blazier. Vol. I, No. 2, 1924. 69 p. 

Proceedings of the Eleventh Conference on Educational
Measurements. Vol. I, No. 3, 1925. 141 p.

Proceedings of the High School Principals’ Conference (November 
14 and 15, 1924). Vol. I, No. 4, 1925. 49 p. (Supply exhausted.) 

First Revision of the Bibliography of Educational Measurements. 
Compiled by the Bureau of Coöperative Research. Vol. I, No. 5, 1925.
147 p. (Supply exhausted.) 

Proceedings of the Twelfth Conference on Educational Measure- 
ments. Vol. I, No. 6, 1925. 76 p. 

The Effect of Population Upon Ability to Support Education. By 
Harold F. Clark. Vol. II, No. 1, 1925. 28 p. 

Proceedings of the High School Principals’ Conference (November 
20 and 21, 1925). Vol. II, No. 2, 1925. 77 p. (Supply exhausted.) 

A Cross-Indexed Bibliography on School Budgets. By Harold F. 
Clark. Vol. II, No. 3, 1926. 66 p. 

A Comparison of the Results Made on Certain Standardized Tests 
by Pupils in the Bloomington High School Who Were Taught in Classes 
of the Same Grade by University Student Teachers and by Regular High 
School Teachers. By Carl G. F. Franzén. Vol. II, No. 4, 1926. 19 p. 

Proceedings of the Thirteenth Annual Conference on Educational 
Measurements. Vol. II, No. 5, 1926. 103 p. 

When to Issue School Bonds. By Harold Florian Clark and Paul 
Royalty. Vol. II, No. 6, 1926. 16 p. 

Students’ Attitude Toward Examinations. By Grover T. Somers. 
Vol. III, No. 1, 1926. 48 p. 

Proceedings of the High School Principals’ Conference (November 
12 and 13, 1926). Vol. III, No. 2, 1926. 27 p. 




Index Numbers in School Administration. By Harold F. Clark. 
Vol. III, No. 3, 1927. 35 p. 

Topical Analysis of 234 School Surveys. Compiled by the Bureau 
of Coöperative Research. Vol. III, No. 4, 1927. 111 p. (Supply
exhausted.)

Proceedings of the Fourth Annual Conference on Elementary Super- 
vision. Vol. III, No. 5, 1927. 64 p. 

Proceedings of the Fourteenth Annual Conference on Educational 
Measurements. Vol. III, No. 6, 1927. 66 p. 

Some Phases of the Junior College Movement. By I. Owen Foster, 
Harold F. Clark, Willard W. Patty, and Leo M. Chamberlain. Vol. IV, 
No. 1, 1927. 125 p. (Supply exhausted.) 

Second Revision of the Bibliography of Educational Measurements. 
By Henry Lester Smith and Wendell William Wright. Vol. IV, No. 2, 
1927. 251 p. 

Bibliography on School Buildings, Grounds, and Equipment. By 
Henry Lester Smith and Leo Martin Chamberlain. Vol. IV, No. 3, 1928. 
326 p. 

Proceedings of the High School Principals’ Conference (November 
18 and 19, 1927). Vol. IV, No. 4, 1928. 54 p. 

The Economic Effects of Education. By Harold F. Clark. Vol. IV, 
No. 5, 1928. 39 p. 

Proceedings of the Fifteenth Annual Conference on Educational 
Measurements. Vol. IV, No. 6, 1928. 73 p. 

Proceedings of the Fifth Annual Conference on Elementary Super- 
vision. Vol. V, No. 1, 1928. 54 p. 

Proceedings of the High School Principals’ Conference (November 
16 and 17, 1928). Vol. V, No. 2, 1928. 33 p. 

The Development and Use of a Composite Achievement Test. By 
Wendell William Wright. Vol. V, No. 3, 1929. 90 p. 

An Analysis of the Attitudes of American Educators and Others 
Toward a Program of Education for World Friendship and Understand- 
ing. By Henry Lester Smith and Leo Martin Chamberlain. Vol. V, No. 
4, 1929. 109 p. 

Tentative Program for Teaching World Friendship and Understand- 
ing in Teacher Training Institutions and in Public Schools for Children 
Who Range From Six to Fourteen Years of Age. By Henry Lester 
Smith and Sherman Gideon Crayton. Vol. V, No. 5, 1929. 54 p. 

Proceedings of the Sixteenth Annual Conference on Educational 
Measurements. Vol. V, No. 6, 1929. 96 p. 

Proceedings of the Sixth Annual Conference on Elementary Super- 
vision. Vol. VI, No. 1, 1929. 73 p. 

An Analysis of the Duties of County School Superintendents and 
Superintendents of Schools in Certain Cities in Indiana. By Henry Les- 
ter Smith and Leo Martin Chamberlain. Vol. VI, No. 2, 1929. 94 p. 

Proceedings of the High School Principals’ Conference (November 
22 and 23, 1929). Vol. VI, No. 3, 1930. 51 p. 

Coöperative Studies in Secondary Education. By Henry Lester
Smith and Carl G. F. Franzén. Vol. VI, No. 4, 1930. 121 p. 




Proceedings of the Seventeenth Annual Conference on Educational 
Measurements. Vol. VI, No. 5, 1930. 103 p. 

Proceedings of the Seventh Annual Conference on Elementary 
Supervision. Vol. VI, No. 6, 1930. 102 p. 

A Study of Teacher Supply and Demand in Indiana. By I. Owen 
Foster, Robert K. Devricks, Harry N. Fitch, Earl C. Bowman, and 
George L. Roberts. Vol. VII, No. 1, 1930. 77 p. 

Proceedings of the High School Principals’ Conference (November 
7 and 8, 1930). Vol. VII, No. 2, 1930. 70 p. 

The Philosophy of Human Relations—Individual and Collective. By 
Henry Lester Smith and Harold Littell. Vol. VII, No. 3, 1931. 326 p. 

The Psychology of Human Relationships: Individual and Social. By 
Henry Lester Smith and Levi McKinley Krueger. Vol. VII, No. 4, 1931.
107 p. 




WILLIAM B. BURFORD PRINTING CO. 
INDIANAPOLIS 




